Abstract. In a previous paper [20] in this series, we gave $L^p$ estimates for multi-linear
operators given by multipliers which are singular on a non-degenerate subspace of some
dimension $k$. In this paper we give uniform estimates when the subspace approaches a
degenerate region in the case $k = 1$, and when all the exponents $p$ are between 2 and
$\infty$. In particular we recover the non-endpoint uniform estimates for the Bilinear Hilbert
transform in [12].

1. Introduction 

We are concerned with $n$-linear forms mapping $n$ Schwartz functions on the real line
to a complex number. We shall assume these forms are invariant under a simultaneous
translation of all $n$ functions. Dually, these forms can be viewed as operators mapping
$n - 1$ functions to a distribution. Our goal is to prove $L^p$ regularity for these forms and
operators.

By the Schwartz kernel theorem, we can identify such an $n$-linear form with a distribution
in $\mathbf{R}^n$. Translation invariance implies that the Fourier transform of this distribution
lives on the hyperplane

\[ \Gamma := \{\xi \in \mathbf{R}^n : \xi_1 + \dots + \xi_n = 0\}. \]

Since we are interested in $L^p$ regularity, we may restrict attention to the case that this
distribution is a function $m(\xi_1, \dots, \xi_n)$ on this hyperplane.
Then the associated multi-linear form $\Lambda := \Lambda_m$ is given by

\[ \Lambda_m(f_1, \dots, f_n) := \int \delta(\xi_1 + \dots + \xi_n)\, m(\xi)\, \hat f_1(\xi_1) \cdots \hat f_n(\xi_n)\, d\xi \]

where $\xi =: (\xi_1, \dots, \xi_n)$ and $\delta$ denotes the Dirac delta. Dually, the associated multi-linear
operator $T := T_m$ is given by

\[ \widehat{T_m(f_1, \dots, f_{n-1})}(-\xi_n) := \int \delta(\xi_1 + \dots + \xi_n)\, m(\xi)\, \hat f_1(\xi_1) \cdots \hat f_{n-1}(\xi_{n-1})\, d\xi_1 \cdots d\xi_{n-1}; \tag{1} \]

the relationship between $T$ and $\Lambda$ is given by

\[ \Lambda(f_1, \dots, f_n) = \int T(f_1, \dots, f_{n-1})(x)\, f_n(x)\, dx. \tag{2} \]

Examples of such objects include the pointwise product operator

\[ T_1(f_1, \dots, f_{n-1}) := f_1 \cdots f_{n-1}, \]

which occurs when the multiplier $m$ is identically 1. Operators from classical paraproduct
theory studied in @-[0, fL6| , |1^| also fall into this category and are given by multipliers
satisfying symbol estimates as in (6) below with $\Gamma' = \{0\}$. When $n = 2$ these operators
are just linear Fourier multipliers. More recently, multipliers satisfying (6) with nontrivial
subspaces $\Gamma'$ have been studied. Observe that (6) is invariant under translations in
direction of $\Gamma'$, so the class of multipliers satisfying this condition is translation invariant.
Taking the Fourier transform we obtain a class of forms and operators which has modulation
symmetry. In case the multiplier is invariant under translations in direction of $\Gamma'$,
the operator itself has a modulation symmetry. Hence the title of this paper. Such forms
and operators have been discussed in [20], [10] (also [25], [23], [22], [|], [11]).



Given a multiplier $m$, the main interest in the subject is to obtain $L^p$ estimates for $\Lambda_m$
of the form

\[ |\Lambda_m(f_1, \dots, f_n)| \le C \prod_{i=1}^n \|f_i\|_{p_i} \tag{3} \]

when $1 < p_i < \infty$. By duality this is equivalent to $T_m$ having the mapping property

\[ T_m : L^{p_1} \times \dots \times L^{p_{n-1}} \to L^{p_n'} \]

on test functions, where $p'$ is given by $1/p + 1/p' := 1$. When (3) obtains, we say that $\Lambda_m$
is of strong type $(1/p_1, \dots, 1/p_n)$. In the dual formulation for $T_m$ one may also consider
the case $p_n' < 1$, but we shall not do so in this paper.

In the pointwise product case $m = 1$ one has strong type whenever $1 < p_i < \infty$ and
one has the scaling condition

\[ \sum_{i=1}^n \frac{1}{p_i} = 1. \tag{4} \]

These estimates (with the exception of some of the endpoint estimates) also generalize to
paraproducts and the bilinear Hilbert transform. We cite the following multiplier theorem,
proven in [20]:



Theorem 1.1. [20] Let $\Gamma'$ be a subspace of $\Gamma$ of dimension $k$ where

\[ 0 \le k < n/2. \tag{5} \]

Assume that $\Gamma'$ is non-degenerate in the sense that for every $1 \le i_1 < i_2 < \dots < i_k \le n$, the
space $\Gamma'$ is a graph over the variables $\xi_{i_1}, \dots, \xi_{i_k}$. Suppose that $m$ satisfies the estimates

\[ |\partial_\xi^\alpha m(\xi)| \le C\, \mathrm{dist}(\xi, \Gamma')^{-|\alpha|} \tag{6} \]

for all partial derivatives $\partial_\xi^\alpha$ on the hyperplane $\Gamma$ up to some sufficiently large finite order. Then
(3) holds whenever (4) holds and $1 < p_i < \infty$ for all $i = 1, \dots, n$.
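
As an informal aid (not part of the original argument), the purely numerical hypotheses of Theorem 1.1 can be checked mechanically. The sketch below reads condition (5) as $0 \le k < n/2$, which includes the paraproduct case $\Gamma' = \{0\}$, and it does not test the symbol estimates (6) or the non-degeneracy of $\Gamma'$. The function names are ours and purely illustrative.

```python
from fractions import Fraction


def satisfies_scaling(exponents):
    """Scaling condition (4): the reciprocals 1/p_i sum to exactly 1."""
    return sum(1 / Fraction(p) for p in exponents) == 1


def theorem_1_1_applies(n, k, exponents):
    """Check only the numerical hypotheses of Theorem 1.1.

    n         -- number of functions in the form
    k         -- dimension of the singular subspace (assumed reading: 0 <= k < n/2)
    exponents -- the finite exponents p_1, ..., p_n (ints, Fractions or strings like "5/2")
    """
    if len(exponents) != n:
        return False
    dimension_ok = 0 <= 2 * k < n                       # condition (5)
    range_ok = all(Fraction(p) > 1 for p in exponents)  # 1 < p_i (finite by assumption)
    return dimension_ok and range_ok and satisfies_scaling(exponents)
```

For instance, the trilinear case $n = 3$, $k = 1$ with $p_1 = p_2 = p_3 = 3$ passes, while $(2, 2, 2)$ fails the scaling condition.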

Following up on the work begun in [26], we consider in this paper the problem of
obtaining uniform estimates when the subspace $\Gamma'$ becomes increasingly degenerate. In
order to do so one must modify the condition (6). To illustrate this, suppose that $n = 3$,
$k = 1$, and $\Gamma'$ has degenerated completely to

\[ \Gamma' = \{(\xi_1, \xi_2, \xi_3) \in \Gamma : \xi_1 = 0\}. \]

(This space is a graph over the $\xi_2$ or $\xi_3$ variables, but not over $\xi_1$). Let $m$ denote the
multiplier

\[ m(\xi_1, \xi_2, \xi_3) := \psi(\xi_1)\, a(\xi_2) \]

where $\psi$ is a bump function on $[1, 2]$ and $a$ is any smooth function satisfying the estimates
$\|\partial^k a\|_\infty \le C_k$ for all integers $k$. This multiplier $m$ satisfies (6) for the given degenerate
subspace $\Gamma'$. Then one has

\[ \Lambda_m(f_1, f_2, f_3) = \int T_\psi(f_1)\, T_a(f_2)\, f_3 \]

where $T_m$ is the Fourier multiplier corresponding to $m$. This operator can be quite badly
behaved because $a$ need not be an $L^p$ multiplier for any $p \ne 2$. Indeed, it is easy to
construct examples which show unboundedness of this operator whenever $p_2 > 2 > p_3$ or
$p_2 < 2 < p_3$.

Footnote (to the citation of Theorem 1.1). Actually, a more general theorem was proven in [20] in which some $1/p_i$ were allowed to be zero or
negative, but we will not discuss this further here. Also, the $n = 3$ case of Theorem 1.1 was first proven
in fill, &

The main result of this paper is the following $k = 1$ result.

Theorem 1.2. Let $n \ge 3$, and let $\Gamma' = \mathrm{span}(v)$ be a one-dimensional subspace of $\Gamma$, where
$v = (v_1, \dots, v_n)$ and $v_1, \dots, v_n$ are non-zero real numbers which sum to zero. Define the
metric $d_v(x, y)$ on $\Gamma$ by

\[ d_v(x, y) := \sup_{1 \le i \le n} \frac{|x_i - y_i|}{|v_i|} \]

and write

\[ d_v(x, \Gamma') := \inf_{y \in \Gamma'} d_v(x, y). \]

Suppose that $m$ satisfies the estimates

\[ |\partial_\xi^\alpha m(\xi)| \le C \prod_{j=1}^n \big( |v_j|\, d_v(\xi, \Gamma') \big)^{-\alpha_j} \tag{7} \]

for all partial derivatives $\partial_\xi^\alpha$ on $\Gamma$ up to some finite order. Then (3) holds whenever (4)
holds and $2 < p_i < \infty$ for all $i = 1, \dots, n$, with the bounds uniform in the choice of $v_i$.
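
To make the anisotropic distance in Theorem 1.2 concrete, here is a small Python sketch. It assumes the reading $d_v(x, y) = \sup_i |x_i - y_i| / |v_i|$ of the metric; under that reading, the distance from $x$ to the line $\mathrm{span}(v)$ is $\inf_t \max_i |x_i / v_i - t|$, which equals half the spread of the ratios $x_i / v_i$. The function names are illustrative only.

```python
def d_v(x, y, v):
    """Anisotropic sup-distance: max_i |x_i - y_i| / |v_i| (all v_i non-zero)."""
    return max(abs(xi - yi) / abs(vi) for xi, yi, vi in zip(x, y, v))


def d_v_to_line(x, v):
    """Distance from x to span(v) in the metric d_v.

    Since |x_i - t v_i| / |v_i| = |x_i / v_i - t|, minimising over t places t
    at the midpoint of the ratios, giving half of (max ratio - min ratio).
    """
    ratios = [xi / vi for xi, vi in zip(x, v)]
    return (max(ratios) - min(ratios)) / 2
```

Points on the line itself have distance zero, and the closed form agrees with a direct evaluation of $d_v(x, t v)$ at the midpoint value of $t$.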

We do not know how to modify (7) in the higher rank case $k > 1$, mainly because we
have no natural analogue of the metric $d_v$. In analogy with [25], [TB|] one expects that one
should be able to go beyond the case $2 < p_i < \infty$, but we do not pursue these matters
here. Restricting ourselves to the case $2 < p_i < \infty$ gives us some considerable technical
simplification.

In the non-degenerate case, when all the $v_i$ have comparable magnitudes, this theorem
is a corollary of Theorem 1.1. A more careful examination of the proof of this Theorem
in [20] would reveal that the constant given for (3) would grow polynomially in the ratio
between the largest and smallest magnitudes of $v_j$.

A special case of this theorem occurs when $n = 3$, $v = (\beta_2 - \beta_3, \beta_3 - \beta_1, \beta_1 - \beta_2)$, for
some distinct real numbers $\beta_1, \beta_2, \beta_3$, and $m(\xi_1, \xi_2, \xi_3) = \mathrm{sgn}(\beta_1 \xi_1 + \beta_2 \xi_2 + \beta_3 \xi_3)$. This
corresponds to the (essentially one-parameter) family of bilinear Hilbert transforms

\[ \Lambda_m(f_1, f_2, f_3) = C \int\!\!\int \prod_{j=1}^3 f_j(x - \beta_j t)\, \frac{dt}{t}\, dx. \tag{8} \]
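
The algebra behind this special case is easy to verify by machine. The following sketch (names ours, exact arithmetic via `fractions`) checks that the vector $v$ built from $\beta$ sums to zero, so it lies in $\Gamma$, and is orthogonal to $\beta$, so the multiplier $\mathrm{sgn}(\beta \cdot \xi)$ is singular along $\mathrm{span}(v)$. The helper `degeneracy_ratio` is an illustrative measure of how degenerate the configuration is, namely the ratio between the largest and smallest $|v_j|$.

```python
from fractions import Fraction


def bht_direction(beta):
    """The vector v = (b2 - b3, b3 - b1, b1 - b2) attached to beta = (b1, b2, b3)."""
    b1, b2, b3 = beta
    return (b2 - b3, b3 - b1, b1 - b2)


def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def degeneracy_ratio(v):
    """Ratio of the largest to the smallest |v_j|; requires all v_j non-zero."""
    sizes = [abs(c) for c in v]
    return max(sizes) / min(sizes)
```

When two of the $\beta_j$ approach each other, one component of $v$ tends to zero and the ratio blows up; this is the near-degenerate regime in which uniform bounds are sought.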
In [pjj, [18] the estimate (3) was proven for all $1 < p_i < \infty$ obeying (4), but with the
bound growing polynomially when two of the $\beta_j$ approached each other. In [26] the weak-type
estimate $T_m : L^2 \times L^2 \to L^{1,\infty}$ was shown with uniform control on the bounds
when $\beta_1$ or $\beta_2$ approached $\beta_3$, but not when $\beta_1$ approached $\beta_2$. (This weak-type bound
is already sufficient to prove a version of Calderón's conjecture which is strong enough to
recover the boundedness of the Cauchy integral on Lipschitz curves. See [26] for further
discussion). More recently, (3) was established whenever $2 < p_i < \infty$ obeyed (4), with
bounds uniform over all values of $\beta$ in (TJJ . Thus Theorem 1.2 is already known in this
case. Also, estimates beyond the $2 < p_i < \infty$ case have been obtained in [13].

Our argument follows the standard approach to the modulation invariant setting of |HJ ,
0,0,0 (see also 0, n:). We perform dyadic decompositions in both space
and frequency, which has the effect of decomposing the phase plane into an overlapping
set of tiles. We then split $\Lambda_m$ into pieces associated to various $n$-tuples of tiles. By using
tree selection arguments and an orthogonality argument based on Bessel's inequality for
tiles as in the above references, we can reduce matters to estimating the contribution of
a single tree of tiles. After eliminating the spatial cutoffs, the problem now becomes that
of obtaining uniform estimates for paraproducts. This estimate may be of independent
interest; it has been shown in the prequel [21] of this paper.
We quote it in the form that is needed here:



Proposition 1.3. Let $n \ge 2$, and let $M_1, \dots, M_n$ be real numbers. For each $1 \le i \le n$
and $j \in \mathbf{Z}$, let $\pi_{i,j}$ be a Fourier multiplier whose symbol is a bump function adapted to
$\{\xi : |\xi| \le 2^{j + M_i}\}$. Suppose that for each $k \in \mathbf{Z}$ there exists at least one $1 \le i \le n$ such
that the symbol of $\pi_{i,k}$ vanishes at the origin. Then one has the estimate

\[ \Big| \sum_j \int \prod_{i=1}^n \pi_{i,j} f_i(x)\, dx \Big| \le C_{(p_i),n} \prod_{i=1}^n \|f_i\|_{p_i} \tag{9} \]

for all $1 < p_i < \infty$ obeying (4), where the constant $C_{(p_i),n}$ depends on the $p_i$ and $n$ but is
independent of the $M_i$ and $f_i$.

Here we say a bump function is adapted to an interval $I$ if it is supported in this interval
and its $k$-th derivative is bounded by $|I|^{-k}$ for all $k$ up to some sufficiently large power,
which may depend on $n$ and $(p_i)$. Proposition 1.3 may be viewed as a lacunary version of
Theorem 1.2, in the same way that paraproduct estimates are a lacunary version of the
results in 0, [|T7 2"0| and the theory of maximal truncated Hilbert transforms is a
lacunary version of the Carleson-Hunt theorem. Indeed, it is possible to derive Proposition
1.3 as a special case of Theorem 1.2, at least in the $2 < p_i < \infty$ case.

One of the main innovations in this paper lies in using phase space projections to obtain
the tree estimate from Proposition 1.3. Such phase plane projections were previously
only known and utilized in the Walsh case [27]. The fact that we are restricted to the
$2 < p_i < \infty$ case allows for some simplifications in the argument; most notably the
argument works with a uniform spatial localization of all functions involved, unlike in
[26]. Moreover, one can replace all the $f_i$ by characteristic functions. Also, since all
functions are locally in $L^2$, one does not need to remove any exceptional set (which needs
to be done for the $p_i < 2$ theory).

The first author was partially supported by a Sloan Dissertation Fellowship. The second 
author is a Clay Prize Fellow and is supported by grants from the Sloan and Packard 
NSF grants DMS 9985572 and DMS 9970469. The third author would like to thank the
Mathematics Department of Arizona State University for their hospitality during a visit 
in which part of the work on this paper was done. 



2. Discretization in the frequency space 



The first step in the proof of Theorem 1.2 is a Whitney decomposition away from $\Gamma'$ of
the multiplier $m$, which is defined as a function in frequency variables. The Whitney pieces
will be smooth functions and compactly supported in the frequency variables. While we
would like to think of these pieces as parameterized by dyadic intervals, it will not be
possible to use the standard dyadic grid to parameterize them. As in [17], we will spend
some effort in constructing a grid structure akin to the dyadic structure. We would have
preferred to avoid this technical step, which involves a strong interaction of
intervals in $i$- and $i'$-coordinates for $i \ne i'$, but we have not been able to proceed without
it.

We fix $n$. All constants that follow (typically denoted by $C$) may depend on $n$. We
also fix $p_1, \dots, p_n$ as in Theorem 1.2, and all constants are also allowed to depend on
$p_1, \dots, p_n$. In particular, we choose a large $N$ depending on these exponents, which will
describe decay of functions in physical space.

We shall need the constants $C_0 = 2^{100n}$ and $J = 2^{C_0}$. These measure the fineness of
our Whitney decomposition and separation in scales respectively. All constants $C$ can be
viewed as dependent on $C_0$ and $J$.

If $Q$ is a box in $\mathbf{R}^n$ with sides parallel to the axes, we use $Q_1, \dots, Q_n$ to denote the
intervals comprising $Q$, thus $Q = Q_1 \times \dots \times Q_n$. All our intervals and boxes will be closed,
but when we say two intervals are disjoint we mean disjoint up to possibly points at the
boundary. For $A > 0$, we denote by $AQ$ the box with the same center as $Q$ and $A$ times
the sidelength.

In the non-degenerate case treated in [20], we used the standard Euclidean Whitney
decomposition of $\mathbf{R}^n \setminus \Gamma'$ into cubes. This decomposition does not work well in the
near-degenerate setting; following [26] we shall instead decompose the multiplier into boxes
which are adapted to the subspace $\Gamma'$. To do this, it is convenient to perform a rescaling
to convert the near-degenerate space $\Gamma'$ into a non-degenerate one, which we choose to be
the diagonal

\[ \tilde\Gamma' = \{(\xi, \dots, \xi) : \xi \in \mathbf{R}\} \subset \mathbf{R}^n. \]

We shall use tildes to denote quantities that are defined in the rescaled setting.

Fix $v_1, \dots, v_n$. By symmetry we may assume $|v_1| \ge \dots \ge |v_n|$. By rescaling we may
assume $|v_n| = 1$. It is important that all our constants $C$ will be independent of the $v_i$;
indeed, this is the whole point of this paper. We also define $M_i$ and $m_i$ by $2^{M_i} = |v_i|$ and
$J m_i = M_i$.

Let $L : \mathbf{R}^n \to \mathbf{R}^n$ be the linear transformation

\[ L(x_1, \dots, x_n) := (v_1 x_1, \dots, v_n x_n). \]

The space $L^{-1}(\Gamma')$ is thus the diagonal $\tilde\Gamma'$ in $\mathbf{R}^n$.
One can easily verify from (7) that one can find a smooth function $\tilde m : \mathbf{R}^n \setminus \tilde\Gamma' \to \mathbf{R}$
which agrees with $m \circ L$ on $L^{-1}(\Gamma)$ and which satisfies the symbol estimates

\[ |\partial_\xi^\alpha \tilde m(\xi)| \le C\, \mathrm{dist}(\xi, \tilde\Gamma')^{-|\alpha|} \]

for all $\alpha$ up to some large finite order $N^5$, where dist is now Euclidean distance.
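
The effect of the rescaling can be illustrated numerically. The sketch below (our own helper names, floating point arithmetic) applies $L^{-1}$ coordinatewise, confirms that points of $\mathrm{span}(v)$ land on the diagonal, and compares the sup-type distance to the diagonal with the Euclidean one. The two are comparable up to a factor $\sqrt{n}$, which is why passing to Euclidean distance only costs a constant depending on $n$.

```python
import math


def rescale_inverse(x, v):
    """Apply L^{-1}: divide the i-th coordinate by v_i (all v_i non-zero)."""
    return [xi / vi for xi, vi in zip(x, v)]


def sup_distance_to_diagonal(xi):
    """inf over t of max_i |xi_i - t|, i.e. half the spread of the coordinates."""
    return (max(xi) - min(xi)) / 2


def euclidean_distance_to_diagonal(xi):
    """Euclidean distance to {(t, ..., t)}: subtract the mean, take the 2-norm."""
    mean = sum(xi) / len(xi)
    return math.sqrt(sum((c - mean) ** 2 for c in xi))
```

The comparison `sup <= euclidean <= sqrt(n) * sup` follows because the mean is the Euclidean minimiser and the midpoint is the sup-norm minimiser.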

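Let $\tilde{\mathbf{Q}}$ denote the set of all cubes $\tilde Q$ which have side-length $2^j$ for some integer $j$, whose
center lies in the lattice $2^{j-10} \mathbf{Z}^n$, and which obey the Whitney conditions

\[ C_0 \tilde Q \cap \tilde\Gamma' = \emptyset \tag{10} \]

\[ 4 C_0 \tilde Q \cap \tilde\Gamma' \ne \emptyset. \tag{11} \]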
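
The Whitney conditions (10) and (11) reduce to a one-line test, since a dilated cube meets the diagonal exactly when its coordinate intervals have a common point. The following sketch uses hypothetical helper names and ignores the lattice condition on the centers; the paper's constant $C_0$ is astronomically large, so the usage examples pass a small stand-in value.

```python
def dilate_meets_diagonal(center, side, factor):
    """Does the cube with the given center and side length factor * side
    contain a point (t, ..., t) of the diagonal?

    This happens exactly when the n coordinate intervals
    [c_i - half, c_i + half] share a common point t.
    """
    half = factor * side / 2
    return max(c - half for c in center) <= min(c + half for c in center)


def is_whitney(center, side, c0):
    """Conditions (10) and (11): c0 * Q misses the diagonal, 4 * c0 * Q meets it."""
    return (not dilate_meets_diagonal(center, side, c0)
            and dilate_meets_diagonal(center, side, 4 * c0))
```

Equivalently, the spread `max(center) - min(center)` must lie in the window `(c0 * side, 4 * c0 * side]`, so the side length of a Whitney cube is comparable to its distance from the diagonal.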

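The sets -^$\tilde Q$ form a finitely overlapping cover of $\mathbf{R}^n \setminus \tilde\Gamma'$, and so one may decompose

\[ \tilde m = C \sum_{\tilde Q \in \tilde{\mathbf{Q}}} \tilde m_{\tilde Q} \]

where each $\tilde m_{\tilde Q}$ is a bump function adapted to \$\tilde Q$ with degree of regularity $N^4$. I.e., $\tilde m_{\tilde Q}$
is supported in \$\tilde Q$ and satisfies

\[ |\partial_\xi^\alpha \tilde m_{\tilde Q}(\xi)| \le \mathrm{diam}(\tilde Q)^{-|\alpha|} \]

for all $|\alpha| \le N^4$. It thus suffices to show that

\[ \Big| \int \delta(\xi_1 + \dots + \xi_n) \sum_{\tilde Q \in \tilde{\mathbf{Q}}} \tilde m_{\tilde Q}(L^{-1}(\xi))\, \hat f_1(\xi_1) \cdots \hat f_n(\xi_n)\, d\xi \Big| \le C \prod_{i=1}^n \|f_i\|_{p_i}. \]

For each $\tilde Q \in \tilde{\mathbf{Q}}$, we may use a Fourier series and the smoothness of $\tilde m_{\tilde Q}$ to decompose
$\tilde m_{\tilde Q}$ into tensor products

\[ \tilde m_{\tilde Q}(\xi) = \sum_{k \in \mathbf{Z}^n} (1 + |k|)^{-10n} \prod_{i=1}^n \tilde m_{\tilde Q, i, k}(\xi_i) \]

where the $\tilde m_{\tilde Q, i, k}$ are bump functions adapted to the interval $\tilde Q_i$ with degree of regularity
$N^3$ uniformly in $k$. It thus suffices to show that

\[ \Big| \int \delta(\xi_1 + \dots + \xi_n) \sum_{\tilde Q \in \tilde{\mathbf{Q}}} \prod_{i=1}^n \tilde m_{\tilde Q, i, k}(v_i^{-1} \xi_i)\, \hat f_i(\xi_i)\, d\xi \Big| \le C \prod_{i=1}^n \|f_i\|_{p_i} \tag{12} \]

for each $k$. We fix $k$ once and for all and for notational convenience drop the index $k$, i.e.,
write $\tilde m_{\tilde Q_i}$ instead of $\tilde m_{\tilde Q, i, k}$.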

Next, we will make the collection of cubes sparser.

Definition 2.1. We call a collection $\tilde{\mathbf{Q}}' \subset \tilde{\mathbf{Q}}$ sparse, if we have for any $\tilde Q, \tilde Q' \in \tilde{\mathbf{Q}}'$ and
any $1 \le i \le n$ the following properties

\[ |\tilde Q_i| < |\tilde Q'_i| \implies |\tilde Q_i| \le 2^{-J} |\tilde Q'_i|, \tag{13} \]

\[ |\tilde Q_i| = |\tilde Q'_i|,\ \tilde Q_i \ne \tilde Q'_i \implies \mathrm{dist}(\tilde Q_i, \tilde Q'_i) \ge 2^J |\tilde Q_i|, \tag{14} \]

\[ \tilde Q_i = \tilde Q'_i \implies \tilde Q = \tilde Q'. \tag{15} \]

Thanks to the fact that at given size the cubes in $\tilde{\mathbf{Q}}$ form essentially a one dimensional
set in direction of the diagonal, we may decompose $\tilde{\mathbf{Q}}$ into a bounded number of sparse
sets. Thus it suffices to prove (12) under the assumption that $\tilde{\mathbf{Q}}$ is now a sparse subset
of the original $\tilde{\mathbf{Q}}$. For convenience of notation we shall call this sparse set again $\tilde{\mathbf{Q}}$. By a
standard limiting argument, we can also assume that $\tilde{\mathbf{Q}}$ is finite, as long as the estimates
do not depend on the set $\tilde{\mathbf{Q}}$.

We now begin to introduce a structure akin to a dyadic grid. We shall be interested
in the enlarged cubes $1000\tilde Q$. For each interval $1000\tilde Q_i$ we shall define a slightly (by at
most one percent on either side) increased interval $\tilde{\tilde Q}_i \supset 1000\tilde Q_i$ so that these increased
intervals have good dyadic properties as formulated in Lemma 2.2.

By (13) the possible sizes of cubes in $\tilde{\mathbf{Q}}$ come in discrete quantities, each separated by
a large factor. We shall refer to these different sizes as different scales.

We shall define the intervals $\tilde{\tilde Q}_i$ successively beginning with the smallest scale in $\tilde{\mathbf{Q}}$.
For $\tilde Q \in \tilde{\mathbf{Q}}$ of the smallest scale we simply set $\tilde{\tilde Q}_i = 1000\tilde Q_i$.

Observe that the distance between $\tilde Q_i$ and $\tilde Q_j$ for $i \ne j$ is less than $8C_0 |\tilde Q_i|$, because
the cartesian product of $4C_0 \tilde Q_i$ and $4C_0 \tilde Q_j$ contains a point on the diagonal by (11). Thus
the diameter of the convex hull $h(\tilde Q)$ of all $10\tilde{\tilde Q}_i$, $1 \le i \le n$, is less than $10C_0 |\tilde Q_i|$. If $\tilde Q'$ is
another cube of the smallest scale in $\tilde{\mathbf{Q}}$, then the convex hull $h(\tilde Q)$ has distance at least
$J|\tilde Q_i|$ from $h(\tilde Q')$ by (14). Hence every interval of length $20C_0 |\tilde Q_i|$ contains at least one
interval of length $|\tilde Q_i|$ which does not intersect $h(\tilde Q')$ for any $\tilde Q'$ at the smallest scale.

Now consider a cube $\tilde Q$ at the second smallest scale. We may increase each interval
$1000\tilde Q_i$ by at most 1 percent on either side, thus obtaining an interval $\tilde{\tilde Q}_i$, so that for any
cube $\tilde Q'$ at the smallest scale we either have $h(\tilde Q') \subset \tilde{\tilde Q}_i$ or $h(\tilde Q') \cap \tilde{\tilde Q}_i = \emptyset$. All we need
is that the endpoints of the enlarged interval are not contained in the convex hull $h(\tilde Q')$
for any $\tilde Q'$ at the smallest scale, which can clearly be achieved from (13) and the previous
discussion.

Then we may define the convex hull $h(\tilde Q)$ as before, and the separation properties
of these convex hulls discussed for the smallest scale also hold for cubes at the second
smallest scale.

By successively passing to larger scales, we can increase each interval $1000\tilde Q_i$ by at
most 1 percent on either side so that it contains either all of $h(\tilde Q')$ for any $\tilde Q'$ at a smaller
scale or is disjoint from $h(\tilde Q')$. Namely each interval of length $|\tilde Q_i|$ contains an interval
of length comparable to the sidelength of cubes at the next smaller scale which does not
intersect $h(\tilde Q')$ for any $\tilde Q'$ at the next smaller scale, and so on passing to smaller and
smaller scales, so we can find an endpoint for $\tilde{\tilde Q}_i$ which is not contained in $h(\tilde Q')$ for any
$\tilde Q'$ of any smaller scale.

We can now formulate the good dyadic properties that the intervals $\tilde{\tilde Q}_i$ have:

Lemma 2.2. Let $\tilde Q, \tilde Q' \in \tilde{\mathbf{Q}}$. If $\mathrm{diam}(\tilde Q) < \mathrm{diam}(\tilde Q')$, then $10\tilde{\tilde Q}_i \cap \tilde{\tilde Q}'_j \ne \emptyset$ for some
$1 \le i, j \le n$ implies $10\tilde{\tilde Q}_{i'} \subset \tilde{\tilde Q}'_{i'}$ for all $1 \le i' \le n$.

Proof The proof is clear by construction. ■

Let $\mathbf{Q}$ denote the set of boxes

\[ \mathbf{Q} := \{L(\tilde Q) : \tilde Q \in \tilde{\mathbf{Q}}\}. \]

We let $j_Q$ denote the rational number such that $2^{J j_Q} = |Q_n|$, and refer to $j_Q$ as the
frequency parameter of $Q$. By (13) the set of $j_Q$ is a discrete set so that any two elements
have at least distance 1. For purely notational convenience we shall assume that $j_Q$ is an
integer for each $Q \in \mathbf{Q}$. This is only a special case of the general situation, in which one
may assume the $j_Q$ belong to a fixed shift of the integer lattice, but the proof is the same
in the general situation with some additional notation.
For each $Q = L(\tilde Q) \in \mathbf{Q}$ define $\pi_{Q_i}$ to be the Fourier multiplier

\[ \widehat{\pi_{Q_i} f}(\xi) := \tilde m_{\tilde Q_i}(v_i^{-1} \xi)\, \hat f(\xi). \]

The notation $\pi_{Q_i}$ is a bit sloppy: to be more precise one should write $\pi_{Q,i}$ or $\pi_{Q_i,i}$. However,
the index $i$ will always either appear in the subscript of $\pi$ or otherwise be clear from the
context. Note that the symbol of $\pi_{Q_i}$ is a bump function adapted to $Q_i$.

By Plancherel, we can then rewrite the desired estimate in physical space as

\[ \Big| \int \sum_{Q \in \mathbf{Q}} \prod_{i=1}^n \pi_{Q_i} f_i(x)\, dx \Big| \le C \prod_{i=1}^n \|f_i\|_{p_i}. \]

This completes our frequency space decomposition. Observe that we have returned to
a notation that does not explicitly involve any quantities with tilde accent. We shall not
need the tilde accent anymore to denote quantities under the transformation $L^{-1}$, and
thus be free to use the tilde accent in other contexts.

3. Discretization in the physical space 

For fixed $Q$, the projections $\pi_{Q_i}$ are Fourier multipliers supported on intervals of length
ranging from $|Q_n|$ to $2^{M_1} |Q_n|$. The Heisenberg uncertainty principle then suggests that
one needs to consider several spatial scales, from the coarse scale of $|Q_n|^{-1}$ to the fine
scale of $2^{-M_1} |Q_n|^{-1}$. This multiplicity of scales causes much technical difficulty in [26],
[12]. A key difference and simplification in our approach is that we only decompose in the
coarsest scale $|Q_n|^{-1}$, so that we will sometimes localize less than the uncertainty principle
suggests. Unfortunately we have only been able to make this simplification work in the
$2 < p < \infty$ case, which is why this paper is restricted to this range of exponents.

In frequency space we have used compactly supported cutoff functions to decompose
the multiplier. Consequently, we will continue to use truncations which have compact
support in frequency space and merely satisfy rapid decay estimates in physical space.
Because of this, we can use the standard dyadic grid to partition physical space as opposed
to the carefully constructed grid we use for frequency space. An interval is called dyadic
if it is of the form $[2^k n, 2^k (n + 1)]$ with integers $k$ and $n$.
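
The standard dyadic grid has the familiar property that two dyadic intervals are either nested or disjoint (up to a shared boundary point). The short sketch below, with illustrative names and exact rational arithmetic, builds dyadic intervals from the definition above and tests this property.

```python
from fractions import Fraction


def dyadic_interval(k, n):
    """The dyadic interval [2^k n, 2^k (n + 1)] for integers k and n."""
    scale = Fraction(2) ** k
    return (scale * n, scale * (n + 1))


def is_power_of_two(q):
    """True if q equals 2^k for some integer k (positive, zero or negative)."""
    q = Fraction(q)

    def pow2(m):
        return m > 0 and (m & (m - 1)) == 0

    return ((q.numerator == 1 and pow2(q.denominator))
            or (q.denominator == 1 and pow2(q.numerator)))


def is_dyadic(a, b):
    """True if [a, b] has length 2^k and its left endpoint is a multiple of 2^k."""
    a, b = Fraction(a), Fraction(b)
    length = b - a
    return length > 0 and is_power_of_two(length) and (a / length).denominator == 1


def nested_or_disjoint(I, K):
    """Two closed intervals are nested, or disjoint up to a boundary point."""
    (a, b), (c, d) = I, K
    disjoint = b <= c or d <= a
    nested = (a <= c and d <= b) or (c <= a and b <= d)
    return disjoint or nested
```

The frequency intervals of Section 2 do not come from this grid, which is why the analogous nesting statement there had to be arranged by hand.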

Following standard procedure, we shall index the space-frequency decomposition using
tiles.

Definition 3.1. Let $1 \le i \le n$. An $i$-tile is a rectangle $P = I_P \times \omega_P$ with area $2^{M_i}$, $I_P$ a
dyadic interval, and $\omega_P$ an interval in the mesh $\{Q_i : Q \in \mathbf{Q}\}$. A multi-tile is an $n$-tuple
$P = (P_1, \dots, P_n)$ such that each $P_i$ is an $i$-tile, the interval $I_{P_i} = I_P$ is independent of $i$,
and such that the frequency box $Q_P := \prod_{i=1}^n \omega_{P_i}$ of $P$ is an element of $\mathbf{Q}$. The frequency
parameter $j_P$ of a multi-tile is defined by $j_P := j_{Q_P}$. In particular, we have $|I_P| = 2^{-J j_P}$.
A multi-tile $P'$ is called a translate of $P$, if $Q_P = Q_{P'}$. If $P = I_P \times \omega_P$ is an $i$-tile and
$\omega_P = Q_i$, we write $\tilde\omega_P$ for $\tilde{\tilde Q}_i$.

Since we assumed $|Q_n|$ to be an integral power of $2^J$, we also have that $|I_P|$ is an integral
power of $2^J$ for all tiles $P$.

Note that $n$-tiles have area 1. In the Walsh analogue [27] they correspond to one
dimensional subspaces of $L^2(\mathbf{R})$, and will thus be easier to handle than the other tiles.
For instance, we shall be able to obtain good $L^p$ bounds on these tiles in addition to $L^2$
bounds thanks to Lemma 5.4. The $i$-tiles have area larger than one, and due to our goal
to prove uniform estimates in the $M_i$, we do not have any control over the area. In the
Walsh model these tiles correspond to high dimensional subspaces of $L^2(\mathbf{R})$, and we can
do little more than considering good $L^2$ estimates on them.

We proceed to define the cutoff operators in physical space. Let $\eta$ denote a fixed positive
function with total $L^1$-mass 1 and with Fourier transform supported in $[-2^{-2J}, 2^{-2J}]$,
satisfying the pointwise estimates

\[ C^{-1} (1 + |x|)^{-N^2} \le \eta(x) \le C (1 + |x|)^{-N^2}. \tag{16} \]

Here $N$ is the previously chosen constant which controls decay in physical space.

Let $\eta_j$ denote the function $\eta_j(x) := 2^{Jj} \eta(2^{Jj} x)$. For any subset $E$ of $\mathbf{R}$, denote by
$\chi_E$ the characteristic function of $E$ and define the smoothed out characteristic function
$\chi_{E,j}$ by

\[ \chi_{E,j} := \chi_E * \eta_j. \]

Note that

\[ \chi_{\biguplus_{\alpha \in A} E_\alpha,\, j} = \sum_{\alpha \in A} \chi_{E_\alpha, j}. \tag{17} \]

Here $\biguplus_{\alpha \in A} E_\alpha$ denotes the disjoint union of the $E_\alpha$; this is the same concept as $\bigcup_{\alpha \in A} E_\alpha$
but is only defined when the $E_\alpha$ are disjoint. In particular, if $\mathbf{R} = \biguplus_{\alpha \in A} E_\alpha$ then

\[ 1 = \sum_{\alpha \in A} \chi_{E_\alpha, j}. \tag{18} \]

Informally, $\chi_{E,j}$ is a frequency-localized approximation to $\chi_E$. In fact we have the
pointwise estimate

\[ |\chi_{E,j}(x) - \chi_E(x)| \le C (1 + 2^{Jj} \mathrm{dist}(x, \partial E))^{-N^2 + 1}, \tag{19} \]

where $\partial E$ is the topological boundary of $E$; this is easily verified by checking the cases
$x \in E$ and $x \in \mathbf{R} \setminus E$ separately.

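Now let $\mathbf{P}_0$ denote the space of all multi-tiles. From (18) we have the identity

\[ \int \sum_{Q \in \mathbf{Q}} \prod_{i=1}^n \pi_{Q_i} f_i(x)\, dx = \sum_{P \in \mathbf{P}_0} \int \chi_{I_P, j_P}(x) \prod_{i=1}^n \pi_{\omega_{P_i}} f_i(x)\, dx. \]

It thus suffices to show that

\[ \Big| \sum_{P \in \mathbf{P}_0} \int \chi_{I_P, j_P}(x) \prod_{i=1}^n \pi_{\omega_{P_i}} f_i(x)\, dx \Big| \le C \prod_{i=1}^n \|f_i\|_{p_i}. \]

Again, by a standard limiting argument it suffices to prove this estimate where $\mathbf{P}_0$ has
been replaced by the set of all tiles $P \in \mathbf{P}_0$ such that $I_P \subset [-2^{Jk}, 2^{Jk}]$ for some fixed large
$k$, provided the constant in the estimate does not depend on $k$. We shall fix such $k$ and
denote this subset by $\mathbf{P}_1$. Thus $\mathbf{P}_1$ is a finite set of tiles (recall that the set $\mathbf{Q}$ of possible
frequency boxes had been restricted to a finite set). We thus are aiming to show

\[ \Big| \sum_{P \in \mathbf{P}_1} \int \chi_{I_P, j_P} \prod_{i=1}^n \pi_{\omega_{P_i}} f_i \Big| \le C \prod_{i=1}^n \|f_i\|_{p_i}. \tag{20} \]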

4. The geometry of tiles and trees 
The following definition establishes an order relation on the set of tiles.

Definition 4.1. Let $P$, $P'$ be $i$-tiles for some $1 \le i \le n$. We say that $P \le P'$ if $I_P \subset I_{P'}$
and $\tilde\omega_P \supset \tilde\omega_{P'}$. If $P$ and $P'$ are two multi-tiles, we say that $P \le P'$ if there exists a
$1 \le i \le n$ such that $P_i \le P'_i$.
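
The order relation can be modelled directly. In the sketch below a tile is represented, hypothetically, by its spatial interval and its enlarged frequency interval, each given as a pair of endpoints; the class and function names are ours. A tile is smaller in the order when it is more localized in space and less localized in frequency.

```python
from typing import NamedTuple, Tuple

Interval = Tuple[float, float]


class Tile(NamedTuple):
    space: Interval  # the spatial interval I_P
    freq: Interval   # the enlarged frequency interval attached to P


def contains(outer, inner):
    """True if the closed interval `outer` contains the closed interval `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def tile_leq(p, q):
    """P <= P' for i-tiles: I_P inside I_P', frequency interval of P containing that of P'."""
    return contains(q.space, p.space) and contains(p.freq, q.freq)


def multitile_leq(p, q):
    """P <= P' for multi-tiles (tuples of tiles): the order holds in some coordinate i."""
    return any(tile_leq(pi, qi) for pi, qi in zip(p, q))
```

For single tiles transitivity is immediate from transitivity of inclusion; for multi-tiles the coordinate `i` may change from one comparison to the next, which is where the grid structure of Section 2 enters.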

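Clearly the order of $i$-tiles is transitive. However, by the good dyadic property of
Lemma 2.2, we also have

Lemma 4.2. The order on multi-tiles is transitive, i.e., $P \le P'$ and $P' \le P''$ imply
$P \le P''$.

Proof Assume $P \ne P'$ and $P' \ne P''$, and write $Q$, $Q'$, $Q''$ for the frequency boxes of
$P$, $P'$, $P''$. Then $P_i \le P'_i$ and $P'_j \le P''_j$ imply $\tilde{\tilde Q}_i \supset \tilde{\tilde Q}'_i$ and
$\tilde{\tilde Q}'_j \supset \tilde{\tilde Q}''_j$ with strict containment by (14). By Lemma 2.2 we conclude $\tilde{\tilde Q}_j \supset \tilde{\tilde Q}'_j$, and hence
$\tilde{\tilde Q}_j \supset \tilde{\tilde Q}''_j$. This gives $P \le P''$ as desired. ■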

As is standard in the theory, the main argument shall consist in splitting the set $\mathbf{P}_1$ into
smaller subsets, for which the name tree has become standard. We call a dyadic interval
$J$-dyadic, if it has length $2^{-Jj}$ for some integer $j$. If $\xi \in \Gamma'$, $1 \le i \le n$, and $I$ is a $J$-dyadic
interval, we define

\[ \omega_{\xi,i,I} := [\xi_i - 2^{M_i} |I|^{-1},\ \xi_i + 2^{M_i} |I|^{-1}] \]

and

\[ \tilde\omega_{\xi,I} := [L^{-1}(\xi)_i - 500 |I|^{-1},\ L^{-1}(\xi)_i + 500 |I|^{-1}]. \]

Observe that the right hand side of the last display does not depend on $i$, because $L^{-1}(\xi)$
is on the diagonal. We shall not need any good dyadic properties for the intervals $\tilde\omega_{\xi,I}$.

Definition 4.3. Let $\xi \in \Gamma'$, let $I$ be a $J$-dyadic interval, and let $T$ be a set of multi-tiles.
The triple $(T, \xi, I)$ is called a tree, if $T$ is non-empty, $I_P \subset I$ for each $P \in T$, and for
each $P \in T$ there is a $1 \le i \le n$ such that $\tilde\omega_{\xi,I} \subset \tilde\omega_{P_i}$.

We shall write $\omega_{i,T}$ and $\tilde\omega_T$ for $\omega_{\xi,i,I}$ and $\tilde\omega_{\xi,I}$. The data $(\xi, I)$ are called the top data
of the tree.

We will frequently call the set $T$ itself a tree, with the understanding that top data
$(\xi_T, I_T)$ are associated to $T$. If $T$ is a tree, we define the box set to be the set
$\mathbf{Q}_T := \{Q_P : P \in T\}$. For each $Q \in \mathbf{Q}_T$, we define the support $E_{Q,T}$ of $Q$ to be the set

\[ E_{Q,T} := \bigcup \{I_P : P \in T,\ Q_P = Q\}. \tag{21} \]
We say that two trees $T$, $T'$ are distinct if $T$ and $T'$ have no tiles in common, that is
$T \cap T' = \emptyset$. (We are reserving the term disjointness for a stronger property, that the tiles
in $T$ and $T'$ do not overlap).

If we say that one tree $T$ is contained in another tree $T'$, $T \subset T'$, this simply means
that the multi-tiles in $T$ are also in $T'$, and does not imply any additional relation between
the top data $(\xi_T, I_T)$ and $(\xi_{T'}, I_{T'})$.

A main observation for trees is that the box set of a tree can be parameterized by the
frequency parameter. In this sense trees make the connection to Littlewood-Paley theory,
in which frequency boxes are essentially parameterized by their scale.

Lemma 4.4. If $T$ is a tree, then for each $j$ there is at most one $Q \in \mathbf{Q}_T$ such that $j = j_Q$.

Proof Let $Q, Q' \in \mathbf{Q}_T$, then there are $i$ and $i'$ such that $\tilde\omega_T \subset \tilde{\tilde Q}_i$ and $\tilde\omega_T \subset \tilde{\tilde Q}'_{i'}$. Thus
$\tilde{\tilde Q}_i \cap \tilde{\tilde Q}'_{i'} \ne \emptyset$. This implies $Q = Q'$ or $j_Q \ne j_{Q'}$ by (|TJ). ■

If $T$ is a tree, we define the scale set

\[ \mathbf{J}_T := \{j_Q : Q \in \mathbf{Q}_T\}. \]

For $j \in \mathbf{J}_T$ we define $Q_j$ to be the $Q \in \mathbf{Q}_T$ for which $j_Q = j$.

The main tree selection algorithm will consist of a tree selection process as follows:

Definition 4.5. A tree selection process shall consist of choosing a tree $T_1$ from $\mathbf{P}_1$, then
choosing a tree $T_2$ from $\mathbf{P}_1 \setminus T_1$ and so on. I.e., at the $k$-th step we choose a tree $T_k$ from
$\mathbf{P}_1 \setminus (T_1 \cup \dots \cup T_{k-1})$. We shall refer to the trees $T_k$ as the selected trees.

For two reasons it will be necessary that at each step these trees be as large as possible.

Definition 4.6. Consider a subset $\mathbf{P}$ of $\mathbf{P}_1$ and top data $(\xi, I)$ as in Definition 4.3. Then
the maximal tree $T^*$ in $\mathbf{P}$ with top data $(\xi_{T^*}, I_{T^*}) = (\xi, I)$ is the set of all $P \in \mathbf{P}$ such
that $I_P \subset I$ and $\tilde\omega_{T^*} \subset \tilde\omega_{P_i}$ for some $1 \le i \le n$.

A tree selection process is called greedy, if at the $k$-th step the tree $T_k$ is maximal in
$\mathbf{P}_1 \setminus (T_1 \cup \dots \cup T_{k-1})$.
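
A greedy selection process in the sense of Definitions 4.5 and 4.6 can be sketched as follows. This is a simplified model under stated assumptions: a multi-tile is a spatial interval together with a tuple of enlarged frequency intervals, the top data are passed in as a caller-supplied sequence of pairs (top frequency interval, top spatial interval), and the rule for choosing that sequence is not modelled here. All names are hypothetical.

```python
from typing import NamedTuple, Tuple

Interval = Tuple[float, float]


class MultiTile(NamedTuple):
    space: Interval               # the common spatial interval I_P
    freqs: Tuple[Interval, ...]   # one enlarged frequency interval per coordinate


def contains(outer, inner):
    """True if the closed interval `outer` contains the closed interval `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def maximal_tree(pool, top_freq, top_space):
    """All multi-tiles in `pool` that qualify for the given top data (Definition 4.6)."""
    return [p for p in pool
            if contains(top_space, p.space)
            and any(contains(w, top_freq) for w in p.freqs)]


def greedy_selection(tiles, tops):
    """Select, for each top datum in turn, the maximal tree among the remaining tiles."""
    remaining = list(tiles)
    selected = []
    for top_freq, top_space in tops:
        tree = maximal_tree(remaining, top_freq, top_space)
        if tree:  # a tree must be non-empty
            selected.append(((top_freq, top_space), tree))
            remaining = [p for p in remaining if p not in tree]
    return selected, remaining
```

By construction the selected trees are distinct, and any tile that qualifies for a given top datum but is missing from that tree must have been removed at an earlier step. The proof of Lemma 4.7 relies on exactly this feature.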

One of the reasons to run a greedy selection process is to gain a nesting property
described by the following lemma:

Lemma 4.7. Let $T$ be one of the selected trees of a greedy tree selection process. If
$Q, Q' \in \mathbf{Q}_T$ and $j_Q < j_{Q'}$, then $E_{Q,T} \supset E_{Q',T}$.

Proof Suppose under the assumptions of the lemma we had $E_{Q,T} \not\supset E_{Q',T}$; then there
would be a $P' \in T$ with $Q_{P'} = Q'$ such that $I_{P'} \not\subset E_{Q,T}$. Pick any $P \in T$ such that $Q_P = Q$,
and let $P''$ be a translate of $P$ such that $I_{P'} \subset I_{P''}$. Clearly $P''$ is an element of $\mathbf{P}_1$.
The multi-tile $P''$ cannot be in the tree $T$ because of $I_{P'} \not\subset E_{Q,T}$, but its geometry would
qualify it to be in the tree, namely, $I_{P''} \subset I_T$ and $\tilde\omega_T \subset \tilde\omega_{P''_i}$ for some $i$. Thus it must
have been selected for a tree $T''$ at a previous stage of the selection process. However,
the geometry of $P'$ qualifies it to be in the same tree $T''$, namely, $I_{P'} \subset I_{P''} \subset I_{T''}$ and
$\tilde\omega_{T''} \subset \tilde\omega_{P'_j}$ because $\tilde\omega_{T''} \subset \tilde\omega_{P''_j}$ for some $j$ and $\tilde\omega_{P''_j} \subset \tilde\omega_{P'_j}$ by Lemma 2.2 and the fact that
$\tilde\omega_{P'_i}$ and $\tilde\omega_{P''_i}$ intersect. This gives a contradiction to the maximality of $T''$. ■

The nesting property of the above lemma implies the following bound on the cardinality
of the finite sets $\partial E_{Q_j,T}$.
Lemma 4.8. Let $T$ be a selected tree of a greedy selection process. Then

\[ \sum_{j \in \mathbf{J}_T} 2^{-Jj}\, \# \partial E_{Q_j,T} \le C |I_T|. \]

This should be compared with the trivial bound $\# \partial E_{Q_j,T} \le 2^{Jj} |I_T|$.
Proof Since $E_{Q_j,T}$ is a finite union of intervals, there are two types of points in $\partial E_{Q_j,T}$:
those that are the left endpoint of connected components in $E_{Q_j,T}$, and those that are
right endpoints. Clearly it suffices to prove the bound for left endpoints only.

For each $j \in \mathbf{J}_T$ and each left endpoint $x$ of $E_{Q_j,T}$ consider the interval
$(x - 2^{-Jj}, x - 2^{-Jj-1})$. We claim that these intervals are pairwise disjoint. This claim implies the
conclusion of the lemma because these intervals are contained in $3I_T$.

To prove the claim, assume $(x - 2^{-Jj}, x - 2^{-Jj-1})$ and $(x' - 2^{-Jj'}, x' - 2^{-Jj'-1})$ have
nonempty intersection. If $j = j'$, we necessarily have $x = x'$ because then both $x$ and
$x'$ are endpoints of dyadic intervals of length $2^{-Jj}$. Thus we can assume $j' < j$. Then $x$
is contained in the interior of the dyadic interval $I'$ of length $2^{-Jj'}$ with right endpoint
$x'$. However, the interior of $I'$ is disjoint from $E_{Q_{j'},T}$ and $x$ is contained in $E_{Q_j,T}$, a
contradiction to Lemma 4.7. This proves the claim. ■
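
The disjointness claim in this proof can be tested on small examples. The sketch below is a toy model with hypothetical names: each level $j \ge 0$ is encoded as a set of integers $k$, standing for the intervals $[k 2^{-Jj}, (k+1) 2^{-Jj}]$ whose union is the set at scale $j$, and a small value of $J$ is used in place of the paper's very large constant. It attaches the marker interval $(x - 2^{-Jj}, x - 2^{-Jj-1})$ to every left endpoint $x$ and checks pairwise disjointness.

```python
from fractions import Fraction


def left_endpoints(cells, length):
    """Left endpoints of the connected components of the union of the cells.

    `cells` is a set of integers k, each standing for [k * length, (k + 1) * length].
    """
    return [k * length for k in cells if (k - 1) not in cells]


def marker_intervals(levels, J):
    """Attach to each left endpoint x at scale j the interval (x - 2^{-Jj}, x - 2^{-Jj-1}).

    `levels` maps a non-negative integer j to the set of cell indices at scale 2^{-Jj}.
    """
    markers = []
    for j, cells in levels.items():
        length = Fraction(1, 2 ** (J * j))
        for x in left_endpoints(cells, length):
            markers.append((x - length, x - length / 2))
    return markers


def pairwise_disjoint(intervals):
    """Open intervals are pairwise disjoint iff, sorted by left endpoint, neighbours do not overlap."""
    ordered = sorted(intervals)
    return all(ordered[i][1] <= ordered[i + 1][0] for i in range(len(ordered) - 1))
```

With nested levels the markers are disjoint, so their total length is bounded by the length of the interval containing them; without nesting the markers can overlap, which shows where Lemma 4.7 is used.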



We will use Lemma [4.8| to replace the spatial truncations in (|20D by certain variants of 
themselves when we sum over the multi-tiles in a selected tree. 

It will be convenient to replace the sets Eqj ^ by variants Ej which have better regularity 
properties: 

Definition 4.9. Let T be a tree, and let It be the collection of all maximal J- dyadic 
intervals I C I T which have the property that 31 does not contain any of the intervals 
Ip with P G T. For an integer j with 2~ Jj < \I?\ let Ej be the union of all intervals I 
in \t such that \I\ < 2 -Jjf (We emphasize that we have strict inequality here, which will 
make the index j most natural, as we can see for example in the following lemma). For 
an integer j with 2~ J i > \It\ we define Ej =0. 

The sets Ej obviously depend on the tree T, but we suppress this dependence. The 
construction of the sets Ej appears implicitly in |19 . 



Clearly the intervals in $\mathbf{I}_T$ form a partition of $I_T$. The nice regularity properties are 
stated in the following lemma: 

Lemma 4.10. Any two neighboring intervals in $\mathbf{I}_T$ differ by at most a factor $2^J$ in length. 
The set $E_j$ is a union of dyadic intervals of length $2^{-Jj}$ and contains $E_{Q_j,T}$ if $j \in \mathbf{J}_T$. 

Proof To prove the first statement, let $I$ and $I'$ be two neighboring intervals of $\mathbf{I}_T$ and 
assume $I$ is larger. Let $I''$ be the dyadic interval which contains $I'$ and has $2^{-J}$ times the 
length of $I$. We have to prove $I'' \subset I'$. However, $3I''$ is contained in $3I$, and thus does not 
contain any interval $I_P$ with $P \in T$. By maximality of $I'$ we have $I'' \subset I'$. This proves 
the first statement of the lemma. 

To prove the second statement, let $I \subset I_T$ be any $J$-dyadic interval of length $2^{-Jj}$ and 
observe the dichotomy that either $I$ is contained in an interval of $\mathbf{I}_T$, or $I$ is partitioned 
into intervals of $\mathbf{I}_T$ which are strictly smaller than $I$. The latter is necessarily the case if 
$I \cap E_j$ is nonempty or if $I = I_P$ for some $P \in T$. This proves the second statement. 
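
The construction in Definition 4.9 can be explored on toy data. The following is a minimal sketch, not the construction used in the proofs: it assumes that intervals are half-open pairs `(start, length)` of exact fractions, that a $J$-dyadic interval is split into $2^J$ equal children, and that the names `maximal_intervals` and `E_j` are purely illustrative.

```python
from fractions import Fraction

def tripled_contains(start, length, tile):
    """True if 3I (same center as I, three times the length) contains the tile interval."""
    t_start, t_len = tile
    return start - length <= t_start and t_start + t_len <= start + 2 * length

def maximal_intervals(top, tiles, J):
    """Maximal J-dyadic intervals I inside `top` such that 3I contains no tile.

    `top` and all tiles are (start, length) pairs of Fractions with positive
    length; `top` is assumed to be J-dyadic. The search runs top-down, which
    finds the maximal intervals because the property passes to sub-intervals.
    """
    result = []
    stack = [top]
    while stack:
        start, length = stack.pop()
        if any(tripled_contains(start, length, t) for t in tiles):
            child = length / 2**J  # split into 2^J children of equal length
            stack.extend((start + k * child, child) for k in range(2**J))
        else:
            result.append((start, length))
    return sorted(result)

def E_j(j, intervals, J):
    """Selected intervals of length strictly below 2^(-J j); their union is E_j."""
    return [iv for iv in intervals if iv[1] < Fraction(1, 2**(J * j))]
```

For example, with `J = 1`, top interval `[0, 1)` and a single tile interval `[0, 1/8)`, the selected intervals shrink towards the tile, and neighboring intervals differ in length by at most the factor $2^J$ asserted in Lemma 4.10.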
Lemma 4.11. With the notation as above, if $I_0$ is a $J$-dyadic interval of length $2^{-Jj}$ such 
that $3I_0 \cap E_j \neq \emptyset$, then there is a multi-tile $P \in T$ with $|I_P| \le |I_0|$ such that $I_P \subset 10I_0$. 

Proof There is a dyadic interval $I_1$ of the same length as $I_0$ which is contained in $3I_0$ and 
$E_j$. By definition of $E_j$, $3I_1$ contains an $I_P$ for some multi-tile $P \in T$. This together 
with (13) proves the lemma. ■ 



The sets $E_j$ are obviously nesting, hence we have the following analogue of Lemma 4.8: 

Lemma 4.12. Let $T$ be any tree. Then with the notation as above, 

    $\sum_j 2^{-Jj}\, \#\partial E_j \le C |I_T|$. 

For each $j \in \mathbf{J}$, let $\Omega_j$ be the collection of connected components of $E_j$; thus $\Omega_j$ is a finite 
collection of intervals. For each $I \in \Omega_j$, let $x^l_I$ and $x^r_I$ denote the left and right endpoints 
of $I$, and let $I^l_j$ and $I^r_j$ denote the intervals 

    $I^l_j := (x^l_I - 2^{-J(j+m_i)-1},\ x^l_I - 2^{-J(j+m_i)-2})$ 

    $I^r_j := (x^r_I + 2^{-J(j+m_i)-2},\ x^r_I + 2^{-J(j+m_i)-1})$. 

Then the intervals $I^l_j$ are disjoint as $j$ varies in the integers with $2^{-Jj} \le |I_T|$ and $I$ varies 
in $\Omega_j$. 

Moreover, if $I^l_j$ is an interval in the above collection, then the distance to the next 
interval $I'^l_{j'}$ is at least $2^{-J(j+2)}$. 
Similar statements hold for the $I^r_j$. 



Proof Most of the proof is exactly as in the proof of Lemma 4.8; the only new statement 
is the one on the distance between two neighboring intervals. 

Let $I^l_j$ be such an interval. It suffices to show that every different interval $I'^l_{j'}$ with $j' \ge j$ 
has distance at least $2^{-J(j+2)}$ from $I^l_j$. The case $j' < j$ then follows by symmetry. 

The case $j = j'$ is easy, since the distance between $I^l_j$ and $I'^l_j$ is a multiple of $2^{-Jj}$. 

Thus consider $j' > j$. Let $I$ be the dyadic interval of length $2^{-Jj}$ which contains $I^l_j$ and 
let $I'$ be the dyadic interval of length $2^{-J(j+1)}$ whose left endpoint is equal to the right 
endpoint of $I$. Clearly $I$ is disjoint from $E_j$, whereas $I'$ is contained in $E_j$. The crucial 
observation is that $I'$ cannot be contained in $E_{j+1}$, because two neighboring intervals in 
$\mathbf{I}_T$ differ by at most a factor $2^J$. 

Thus the distance from $I^l_j$ to any point in $E_{j+1}$ is larger than $2^{-J(j+1)}$, which proves the 
claim. 

■ 

While we will only consider trees selected by a greedy selection process, we remark that 
the sets $E_j$ can be defined and satisfy the above lemmata for arbitrary trees, which do 
not necessarily come from a greedy selection process and thus may not satisfy a nesting 
property as in Lemma 4.7. 

The regularity properties of the sets Ej will be used to construct phase plane projection 
operators. It is worth mentioning that a similar notion of regularity called convexity was 
used in [^7] to construct phase plane projections in the Walsh setting, although technically 
the use of this type of convexity to construct phase plane projections is quite different in 
the current paper. 

We introduce the notion of lacunarity in a tree: 

Definition 4.13. An element $P \in T$ is called $i$-lacunary if $(\xi_T)_i \notin 2\omega_{P_i}$, and it is called 
$i$-non-lacunary if $(\xi_T)_i \in 2\omega_{P_i}$. If $A$ is a subset of $\{1, \ldots, n\}$, then $T_A$ is defined to be 
the set of all elements in $T$ which are $i$-lacunary for all $i \in A$ and $i$-non-lacunary for all 
$i \notin A$. 

Clearly, a tree can be written as the disjoint union of subsets $T_A$ parameterized by all 
subsets $A \subset \{1, \ldots, n\}$. Each of the sets $T_A$ is either empty or again a tree with the same 
top data as $T$. 
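
The decomposition of a tree into the pieces $T_A$ can be sketched as follows. This is only an illustration under stated assumptions: a multi-tile is modelled by the tuple of its frequency intervals, each a closed pair `(left, right)`, indices are 0-based, and the helper names are hypothetical.

```python
def doubled_contains(interval, point):
    """True if `point` lies in 2*omega (same center as omega, twice the length)."""
    left, right = interval
    center, half = (left + right) / 2, (right - left) / 2
    return center - 2 * half <= point <= center + 2 * half

def lacunary_set(tile, xi_top):
    """Set A of indices i for which the tile is i-lacunary, i.e. xi_top[i] is outside 2*omega_{P_i}."""
    return frozenset(
        i for i, omega in enumerate(tile) if not doubled_contains(omega, xi_top[i])
    )

def split_by_lacunarity(tree, xi_top):
    """Group the multi-tiles of `tree` into the disjoint pieces T_A, keyed by A."""
    parts = {}
    for tile in tree:
        parts.setdefault(lacunary_set(tile, xi_top), []).append(tile)
    return parts
```

Each multi-tile lands in exactly one piece, which is the disjoint-union statement above; only the frequency intervals enter the classification.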

We have 

Lemma 4.14. If $T$ is a tree and $A \subset \{1, \ldots, n\}$, then for each $Q$ in $\mathbf{Q}_{T_A}$ we have $Q \in \mathbf{Q}_T$ 
and $E_{Q,T_A} = E_{Q,T}$. 

Proof This follows easily from the fact that lacunarity depends only on the frequency 
intervals. ■ 



As a consequence of this lemma, if $T$ is a selected tree as in Lemma 4.7, then whenever 
$T_A$ is non-empty and thus a tree, it satisfies the analogues of Lemmata 4.7 and 4.8. 
There is always at least one lacunary index: 

Lemma 4.15. If $T$ is a tree, then $T_\emptyset$ is empty. 

Proof If P G X , then L" 1 ^) e 2Q p . Since £ r G V this contradicts pi). ■ 

We return to the second reason for running a greedy selection process. It gives a certain 
strong disjointness property of multi-tiles of selected trees, as described by the following 
lemmata: 

Lemma 4.16. Let $T$ and $T'$ be two selected trees of a greedy selection process and assume 
that $T$ has been selected prior to $T'$. Let $P \in T$ and $P' \in T'$ be such that we have 

    $10\omega_{P_i} \cap 10\omega_{P'_i} \neq \emptyset$    (22) 

and 

    $|\omega_{P_i}| < |\omega_{P'_i}|$    (23) 

for some $i$. Then $I_{P'} \cap I_T = \emptyset$. 

Proof Assume to get a contradiction that $I_{P'} \cap I_T \neq \emptyset$. By size comparison we then 
necessarily have $I_{P'} \subset I_T$. 

By ( p3|) and ( |13| ) we have 100|o;p i | < \u)p;\. Hence (^2|) implies 20u;p. C 20u;p' ^ 0, 
which implies cJp C up; (We made a point of not using Lemma |2.2| to conclude this). 

Hence the geometry of the multi-tile $P'$ qualifies it to be in the tree $T$. But it is not, 
because different selected trees have no multi-tiles in common, so by maximality of $T$ it 
must be an element of a tree that was selected prior to $T$. This contradicts the assumed 
order of selection of $T$ and $T'$. ■ 

The above lemma requires information about the order in which trees have been selected. 
In our application this information will be provided in the form described by the 
next lemma. If $P$ is a multi-tile, $s$ a real number, $\xi \in \mathbb{R}^n$ and $1 \le i \le n$, define the 
interval 

    $\omega^+_{P_i,\xi,s} := [\xi_i + 2^{-s}\,5000C_0|\omega_{P_i}|,\ \xi_i + 2^{2-s}\,5000C_0|\omega_{P_i}|]$. 

Lemma 4.17. Let $1 \le i \le n$. Let $\mathbf{T}$ be a subset of the set of selected trees of a greedy 
selection process such that $T, T' \in \mathbf{T}$ and $(\xi_T)_i > (\xi_{T'})_i$ imply that $T$ has been selected 
prior to $T'$. Let $s$ be some real number. Let $T, T'$ be two (not necessarily different) trees 
in $\mathbf{T}$ and assume $P \in T$ and $P' \in T'$ are such that 

    $10\omega_{P_i} \cap 10\omega_{P'_i} \neq \emptyset$,    (24) 

    $|\omega_{P_i}| < |\omega_{P'_i}|$, 

    $\omega^+_{P_i,\xi_T,s} \cap \omega^+_{P'_i,\xi_{T'},s} \neq \emptyset$.    (25) 

Then $I_{P'} \cap I_T = \emptyset$, and in particular $I_{P'} \cap I_P = \emptyset$ and the trees $T$ and $T'$ are indeed 
different. 
different. 

Proof By the previous lemma we only need to prove that $T$ has been selected prior to 
$T'$. However, (25) together with 

    $100|\omega_{P_i}| < |\omega_{P'_i}|$ 

implies $(\xi_T)_i > (\xi_{T'})_i$, which in turn implies that $T$ has been selected prior to $T'$. ■ 

Similarly, we can define 

    $\omega^-_{P_i,\xi,s} := [\xi_i - 2^{2-s}\,5000C_0|\omega_{P_i}|,\ \xi_i - 2^{-s}\,5000C_0|\omega_{P_i}|]$ 

and then have a lemma analogous to Lemma 4.17, which we do not state explicitly. 
We shall need another variant of this theme. For a tree $T$ define 

    $\omega^+_{i,T,s} := [(\xi_T)_i + 2^{-s}\,10|\omega_{i,T}|,\ (\xi_T)_i + 2^{2-s}\,10|\omega_{i,T}|]$. 

Lemma 4.18. Let $1 \le i \le n$. Let $\mathbf{T}$ be a subset of the set of selected trees of a greedy 
selection process such that $T, T' \in \mathbf{T}$ and $(\xi_T)_i > (\xi_{T'})_i$ imply that $T$ has been selected 
prior to $T'$. 

Let $s$ be some real number. Let $T_0, T_1, T_2, T_3 \in \mathbf{T}$ and assume we have for $j = 1, 2, 3$ 

    $10\omega_{i,T_0} \cap 10\omega_{i,T_j} \neq \emptyset$,    (26) 

    $|\omega_{i,T_0}| < |\omega_{i,T_j}|$,    (27) 

    $\omega^+_{i,T_0,s} \cap \omega^+_{i,T_j,s} \neq \emptyset$.    (28) 

If $T_1, T_2, T_3$ are all different, then $I_{T_1} \cap I_{T_2} \cap I_{T_3} = \emptyset$. 

Proof 

As before, we can use (28) to conclude that if $|I_{T'}| > |I_{T''}|$ for $T', T'' \in \{T_1, T_2, T_3\}$, 
then $T'$ has been selected prior to $T''$. Thus we may assume $|I_{T_1}| \ge |I_{T_2}| \ge |I_{T_3}|$ and 
$T_1, T_2, T_3$ have been selected in this order. 

Assume to get a contradiction that $I_{T_1} \cap I_{T_2} \cap I_{T_3} \neq \emptyset$; then we have by dyadicity 
$I_{T_1} \supset I_{T_2} \supset I_{T_3}$. Let $P$ be a multi-tile in $T_1 \cup T_2 \cup T_3$ for which $|\omega_{P_i}|$ is minimal. Let 
$T \in \{T_1, T_2, T_3\}$ be the tree so that $P \in T$. If there is another multi-tile $P'$ for which 
$|\omega_{P'_i}| = |\omega_{P_i}|$, then we conclude by (14) that $\omega_{P'_i} = \omega_{P_i}$. Hence $P$ and $P'$ are in the same 
tree $T$. 

Now let $P' \in T'$ for some $T' \in \{T_1, T_2, T_3\}$ with $|\omega_{P'_i}| > |\omega_{P_i}|$. We aim to show $P' \in T_1$, 
which will prove the lemma, because then one of the trees $T_2$ and $T_3$ has to be empty, a 
contradiction. 

Observe that by (F2BJ) and Q2"7| ) we have u)% t T Q 20^,^ • By a similar argument for X" we 
have 20dJ iri fl 20^^/ 7^ 0. By assumption on the size of tree tops we have oj^Ti Q 40^^/. 
If lu^pj < |^i,T'|) then this implies Wt x ^ ^t'i which in turn implies P' ETi. 

We may thus assume \oj%,Ti\ — \^i,T'\- Then we can merely conclude Upj fl uJ T i 7^ and 
~&T\ Q 3Up/. By a similar argument with T" in place of T x we have Up/ fl Up 7^ and 
Up/ C 3Up. But Up C oJp. for some index 1 < j '< n. Hence cJp/ C 3o7p.. However, for 
some possibly different index I we have aJp/ C Up/. Hence 3aJp. and oJp/ have non-empty 
intersection, and thus by Lemma ^[2] we have lOcJp C uj p i. However, we have seen before 
Up C 3Up/ and we have 3Up/ C 9uJp.. Hence Up C uJpi, which proves the claim. 



5. A few remarks on smooth truncations and weights 

We pause to prove a few lemmata on weight functions and smooth truncations of 
functions, which are best separated from the main string of arguments so as to not slow 
the main argument down by these technical lemmata. 

Given an interval $I$, we define the approximate cutoff function $\tilde\chi_I$ as 

    $\tilde\chi_I(x) := \left(1 + \frac{\mathrm{dist}(x, I)}{|I|}\right)^{-1}$. 

We shall need the following lemma, which may be of independent interest. 

Lemma 5.1. Let $\mathbf{I}$ be a finite collection of disjoint intervals, and suppose that for each $I \in \mathbf{I}$ 
one has an $L^2$ function $f_I$. Then 

    $\|\sum_{I \in \mathbf{I}} |I|^{1/2} \tilde\chi_I f_I\|_2 \le C \left(\sum_{I \in \mathbf{I}} |I|\right)^{1/2} \sup_{I \in \mathbf{I}} \|f_I\|_2$. 

Note that this lemma would be automatic from Cauchy-Schwarz if $\sum_I \tilde\chi_I$ was uniformly 
bounded. Unfortunately, this function is only in BMO, so one needs to work a little harder. 
This lemma is a special case of a phase space Bessel inequality (98) that we will need in 
the sequel. 
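
The failure of uniform boundedness is easy to see numerically. The sketch below assumes the cutoff $\tilde\chi_I(x) = (1 + \mathrm{dist}(x,I)/|I|)^{-1}$ and the illustrative family of unit intervals $[k, k+1]$; the function names are hypothetical. At $x = 0$ the sum is a harmonic sum and therefore grows logarithmically in the number of intervals.

```python
def chi_tilde(x, left, right):
    """Approximate cutoff (1 + dist(x, I)/|I|)^(-1) for the interval I = [left, right]."""
    dist = max(left - x, 0.0, x - right)
    return 1.0 / (1.0 + dist / (right - left))

def cutoff_sum(x, count):
    """Sum of chi_tilde over the disjoint unit intervals [k, k+1], k = 0, ..., count-1."""
    return sum(chi_tilde(x, k, k + 1) for k in range(count))
```

Since `cutoff_sum(0.0, N)` equals $1 + 1/2 + \dots + 1/N$, no bound independent of the collection is available, which is why the Cauchy-Schwarz shortcut is not enough.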

Proof We may assume that the $f_I$ are real and positive. By estimating $\tilde\chi_I$ by a weighted 
sum of $\chi_{AI}$ for dyadic $A \ge 1$ (as before $AI$ has the same center as $I$ but $A$ times its 
length), it suffices by the triangle inequality to show 

Lemma 5.2. Let $\mathbf{I}$ be as above, $f_I$ be real and positive, and let $A \ge 1$. Then there exists 
an absolute constant $M > 0$ such that 

    $\|\sum_{I \in \mathbf{I}} |I|^{1/2} \chi_{AI} f_I\|_2 \le M A^{1/2} \left(\sum_{I \in \mathbf{I}} |I|\right)^{1/2} \sup_{I \in \mathbf{I}} \|f_I\|_2$.    (29) 


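Proof Fix $A$. Let $M_A$ be the best constant $M$ for which (29) holds for all subsets of $\mathbf{I}$ 
(in place of $\mathbf{I}$ itself) and all $f_I$; our objective is to show that $M_A$ is bounded uniformly in 
$A$ and $\mathbf{I}$. 
By construction there exists a subset $\mathbf{I}'$ of $\mathbf{I}$ and $f_I$ such that 

    $\|\sum_{I \in \mathbf{I}'} |I|^{1/2} \chi_{AI} f_I\|_2 \ge \frac{1}{2} M_A A^{1/2} \left(\sum_{I \in \mathbf{I}'} |I|\right)^{1/2} \sup_{I \in \mathbf{I}'} \|f_I\|_2$. 

Squaring this we obtain 

    $\sum_{I \in \mathbf{I}'} \sum_{J \in \mathbf{I}'} \langle |I|^{1/2} \chi_{AI} f_I,\ |J|^{1/2} \chi_{AJ} f_J \rangle \ge \frac{1}{4} M_A^2 A \left(\sum_{I \in \mathbf{I}'} |I|\right) \sup_{I} \|f_I\|_2^2$. 

One only has a contribution if $J \subset 5AI$ or $I \subset 5AJ$. By symmetry and positivity we thus 
have 

    $\sum_{I \in \mathbf{I}'} \sum_{J : J \subset 5AI} \langle |I|^{1/2} \chi_{AI} f_I,\ |J|^{1/2} \chi_{AJ} f_J \rangle \ge \frac{1}{8} M_A^2 A \left(\sum_{I \in \mathbf{I}'} |I|\right) \sup_{I} \|f_I\|_2^2$. 

Moving the $J$ summation inside the inner product, and using Cauchy-Schwarz and the 
definition of $M_A$, we may estimate the left-hand side by 

    $\sum_{I \in \mathbf{I}'} |I|^{1/2} \|f_I\|_2\, M_A A^{1/2} \left(\sum_{J : J \subset 5AI} |J|\right)^{1/2} \sup_{J} \|f_J\|_2$. 

On the other hand, by disjointness we have $\sum_{J : J \subset 5AI} |J| \le 5A|I|$. Inserting this into the 
previous estimates we obtain $M_A \le C$ as desired. ■ 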



Lemma 5.1 will be used through the following variant: 



Lemma 5.3. Let $I'$ be an interval and let $\mathbf{I}$ be a collection of disjoint intervals such that 
$|I| \le |I'|$ for each $I \in \mathbf{I}$. For each $I \in \mathbf{I}$ consider a positive $L^2$ function $f_I$. Then 

lei I 

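Proof 

We may write 

    $\tilde\chi_{I'} \le C\chi_{I'} + \sum_{k \ge 0} 2^{-k} \chi_{2^{k+1}I' \setminus 2^k I'}$. 

By the triangle inequality it suffices to prove for each $k \ge 0$ 

    $\|\sum_{I \in \mathbf{I} : I \cap (2^{k+1}I' \setminus 2^k I') \neq \emptyset} |I|^{1/2} \tilde\chi_I f_I\|_2 \le C 2^{k/2} |I'|^{1/2} \sup_I \|f_I\|_2$ 

and an analogous estimate for the term $\chi_{I'}$, which is clearly an easy variant of the above. 
Since $|I| \le |I'|$ for all $I \in \mathbf{I}$ and the intervals $I$ are pairwise disjoint, we have 

    $\sum_{I \in \mathbf{I} : I \cap (2^{k+1}I' \setminus 2^k I') \neq \emptyset} |I| \le C 2^k |I'|$. 

The claim now follows by Lemma 5.1. ■ 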



For any symbol $m$ on $\mathbb{R}$, we let $T_m$ be the associated Fourier multiplier. (This is 
consistent with (1) if we lift $m$ from $\mathbb{R}$ to $\{(\xi, -\xi) : \xi \in \mathbb{R}\}$ in the obvious manner.) 
Let $w$ be a positive function on $\mathbb{R}$ and $r > 0$ be a number. We say that $w$ is essentially 
constant at scale $r$ if one has 

    $C^{-1}\left(1 + \frac{|x - y|}{r}\right)^{-100} \le \frac{w(x)}{w(y)} \le C\left(1 + \frac{|x - y|}{r}\right)^{100}$ 

for all $x, y \in \mathbb{R}$. In particular, the weights $\tilde\chi_I^\alpha$ are essentially constant at any scale $|I|$ or 
less when $|\alpha| \le 100$. 
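
A quick numerical check of this definition is sketched below. It assumes the cutoff $\tilde\chi_I(x) = (1 + \mathrm{dist}(x,I)/|I|)^{-1}$, takes $C = 1$ by default, tests only finitely many sample points, and compares logarithms to avoid overflow; the function names and the tolerance `tol` are illustrative choices, not part of the text.

```python
import math

def chi_tilde_power(x, left, right, alpha):
    """The weight chi_tilde_I(x)^alpha for I = [left, right]."""
    dist = max(left - x, 0.0, x - right)
    return (1.0 + dist / (right - left)) ** (-alpha)

def is_essentially_constant(w, r, points, C=1.0, exponent=100, tol=1e-9):
    """Check |log w(x) - log w(y)| <= log C + exponent*log(1 + |x-y|/r) on all sample pairs."""
    log_c = math.log(C)
    for x in points:
        for y in points:
            bound = log_c + exponent * math.log1p(abs(x - y) / r)
            if abs(math.log(w(x)) - math.log(w(y))) > bound + tol:
                return False
    return True
```

On the sample grid used in the test, the weight with exponent 100 passes at scales $|I|$ and $|I|/2$, while the exponent 200 fails at scale $|I|$, in line with the restriction $|\alpha| \le 100$.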

We shall need the following weighted version of Bernstein's inequality. 

Lemma 5.4. Let $f$ be a function whose Fourier transform is supported on an interval $\omega$ 
of width $O(2^{Jj})$ for some integer $j$. Then we have 

    $\|wf\|_\infty \le C 2^{Jj/2} \|wf\|_2$ 

for all weights $w$ which are essentially constant at scale $2^{-Jj}$. 

Proof We can write $f = T_m f$ where $m$ is a suitable bump function adapted to $2\omega$. From 
the decay of the kernel of $T_m$ we thus have the pointwise estimate 

    $|f(x)| = |T_m f(x)| \le C 2^{Jj} \int \frac{|f(y)|}{(1 + 2^{Jj}|x - y|)^N}\, dy$ 

and the claim easily follows. ■ 

Let $T$ be a (possibly vector-valued) convolution operator and $r > 0$ be a number. We 
say that $T$ is essentially local at scale $r$ if the convolution kernel $K(x)$ satisfies the bounds 

    $|K(x)| \le C r^N |x|^{-N}$    (30) 

for $|x| \gg r$. 

Lemma 5.5. Let $T$ be a convolution operator which is bounded on $L^2$ and which is 
essentially local at some scale $r > 0$. Then one has 

    $\|wTf\|_2 \le C\|wf\|_2$    (31) 

for all weights $w$ which are essentially constant at scale $r$. 

Proof We can truncate $T$ so that the kernel is supported on the interval $\{|x| \le Cr\}$; 
from (30) it is easy to see that this does not affect the $L^2$ boundedness of $T$ or (31). The 
claim then follows by partitioning space into intervals of length $r$ and applying the $L^2$ 
boundedness hypothesis to each interval separately. ■ 

6. Phase space norms and the size of a tree 



The general approach to proving an estimate such as (20) is to prove the estimate first 
in the easier case when the summation goes only over a tree rather than the whole set 
$\mathbf{P}$. The point is that this easier estimate is a matter of standard Littlewood-Paley 
theory without modulation invariance. The top of the tree fixes a frequency, which after 
a modulation we can think of as being the zero frequency in standard Littlewood-Paley 
theory. In our situation this Littlewood-Paley estimate is Proposition 1.3. 

The second step is to organize the whole set into trees by a greedy selection process, 
so as to sum the tree estimates. In this organization, the notion of size of a tree plays a 
crucial role. In this section we shall introduce this notion. 
Definition 6.1. In the following definitions $1 \le i \le n$, and $f_i$ is an $L^2$ function. 
If $P_i$ is an $i$-tile and $\xi_i \in \mathbb{R}$, we define the semi-norm $\|f_i\|_{P_i,\xi_i}$ by 

    $\|f_i\|_{P_i,\xi_i} := \sup_{m_{P_i}} \|\tilde\chi_{I_{P_i}}^{10} T_{m_{P_i}}(f_i)\|_2$    (32) 

where $m_{P_i}$ ranges over all smooth functions supported on $10\omega_{P_i}$ which satisfy the estimates 

i^woi <ie-&r fc 4^ ( 33 ) 

\ojp-\ 

I r % I 

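for all $\xi \in \mathbb{R}$ and $0 \le k \le N^2$. In particular, $m_{P_i}$ vanishes at $\xi_i$. 
If $T$ is a tree, we define the $i$-size $\mathrm{size}_i(T)$ of $T$ by 

    $\mathrm{size}_i(T) := \left(\frac{1}{|I_T|} \sum_{P \in T} \|f_i\|^2_{P_i,(\xi_T)_i}\right)^{1/2} + |I_T|^{-1/2} \sup_{m_{i,T}} \|\tilde\chi_{I_T}^{10} T_{m_{i,T}}(f_i)\|_2$    (34) 

where $m_{i,T}$ ranges over all smooth functions supported on $10\omega_{i,T}$ which satisfy the estimates 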

|^m, T (0| < K - (^ H^^ (35) 

\ u i,T\ 

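for all $\xi \in \mathbb{R}$ and $0 \le k \le N^2$. 

If $\mathbf{P}$ is any collection of multi-tiles, we define the maximal size $\mathrm{size}_i^*(\mathbf{P})$ of $\mathbf{P}$ to be 

    $\mathrm{size}_i^*(\mathbf{P}) := \sup_{(T,\xi,I) : T \subset \mathbf{P}} \mathrm{size}_i(T)$    (36) 

where $(T, \xi, I)$ ranges over all trees with $T \subset \mathbf{P}$. 

We remark that for $|\xi - \xi_i| \gg |\omega_{P_i}|$ we have the following estimates for $m_{P_i}$, which are 
stronger than (33): 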
|^m Pi (0| <C\i- ft|-*(J^l)i-« . (37) 

I ■* % I 



for all $0 \le l, k \le N^2/2$. These estimates can be obtained from (33) and the support 
condition on $m_{P_i}$ by integrating the $m_{P_i}$ over its support $l$ times. Thus $m_{P_i}$ is forced to 
be rather small if $\xi_i$ is far away from its support. This observation shall however only be 
of technical importance, since in our applications $\xi_i$ will always be within $C\omega_{P_i}$ for some 
moderate constant $C$. 

Thus, heuristically, $\|f_i\|_{P_i,\xi_i}$ is the $L^2$ norm of $f_i$ when restricted to the portion of $P_i$ 
which is away from the frequency $\xi_i$. The tree size $\mathrm{size}_i(T)$ is heuristically the $L^2$ norm of 
$f_i$ when restricted to the region $\bigcup_{P \in T} P_i \cup (I_T \times \omega_{i,T})$ in phase space. As a gross caricature, 
one has the very approximate relationship 

    $\mathrm{size}_i(T)$ "$\approx$" $\mathrm{osc}_{I_T}(e^{-2\pi i (\xi_T)_i \cdot} f_i)$ 

although this caricature does not fully capture the phase space localization to $T$ in (33). 
Here we have written $\mathrm{osc}_I(f)$ for the $L^2$ mean oscillation given by 

    $\mathrm{osc}_I(f) := \left(\frac{1}{|I|} \int_I \Big|f - \frac{1}{|I|}\int_I f\Big|^2\right)^{1/2}$.    (38) 
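
The caricature can be made concrete on sampled data. The following sketch assumes equally spaced samples on an interval, replaces the integrals in (38) by averages, and uses hypothetical helper names; it is meant only to show why the modulation by the top frequency appears.

```python
import cmath
import math

def mean_oscillation(values):
    """Discrete L^2 mean oscillation: sqrt(mean(|f - mean(f)|^2)) over the samples."""
    mean = sum(values) / len(values)
    return math.sqrt(sum(abs(v - mean) ** 2 for v in values) / len(values))

def demodulate(values, points, xi):
    """Multiply the samples f(x) by exp(-2*pi*i*xi*x), moving the frequency xi to zero."""
    return [v * cmath.exp(-2j * math.pi * xi * x) for v, x in zip(values, points)]
```

A pure oscillation at the top frequency has mean oscillation close to 1 on a full period, but mean oscillation 0 after demodulation, which reflects that the size ignores the part of $f_i$ sitting exactly at the frequency $(\xi_T)_i$.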



If $\mathbf{P}$ is a tree, the maximal size $\mathrm{size}_i^*(\mathbf{P})$ is a strengthened version of the tree size 
$\mathrm{size}_i(\mathbf{P})$. The relationship between the two is analogous to the relationship between the 
BMO norm on an interval $I$ and the $L^2$ mean oscillation on that interval $I$. This analogy 
is particularly accurate when $M_i = 0$ and $(\xi_T)_i = 0$. 

The freedom to choose $I$ in (36) independently of the set $T$ adds a useful "Hardy-Littlewood 
maximal function" component to the size definition. For example this is used 
in the following lemma: 

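Lemma 6.2. We have 

    $\|\tilde\chi_{I_P}^{10} T_{m_{P_i}}(f_i)\|_2 \le C\|f_i\|_{P_i,\xi_i} \le C|I_P|^{1/2}\, \mathrm{size}_i^*(T)$    (39) 

for all indices $1 \le i \le n$, trees $T$, multi-tiles $P \in T$, frequencies $\xi_i \in \mathbb{R}$, and symbols $m_{P_i}$ 
supported on $10\omega_{P_i}$ which obey (33). 
Moreover, 

    $\|\tilde\chi_I^{10} T_{m_{i,I}}(f_i)\|_2 \le C|I|^{1/2}\, \mathrm{size}_i^*(T)$    (40) 

for all indices $1 \le i \le n$, trees $T$, all $i$-non-lacunary multi-tiles $P \in T$, all $J$-dyadic 
intervals $I$ with $I_P \subset 10I$ and all symbols $m_{i,I}$ supported on $5\omega_{i,\xi_T,I}$ and satisfying 

    $|\partial_\xi^k m_{i,I}(\xi)| \le |\omega_{i,\xi_T,I}|^{-k}$ 

for all $\xi \in \mathbb{R}$ and $0 \le k \le N^2$. 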

Proof We first consider (39). The first inequality is just by definition. By the remark 
just after Definition 6.1, we only have to prove the second inequality for $\xi_i \in 100\omega_{P_i}$. 
Then this inequality follows from (34), (36) since $T$ contains the singleton tree $\{P\}$ with 
top data $\xi', I'$ where $\xi'$ is defined by $\xi_i = (\xi')_i$ and $I'$ is the $J$-dyadic interval of length 
$2^J|I_P|$ which contains $I_P$. It is easily verified that these top data indeed turn $\{P\}$ into a 
tree. 

Now we consider (40). Observe first of all that $|I_P| \le |I|$ because both intervals are 
$J$-dyadic. By translating $I$, we may as well assume $I_P \subset I$. Namely, we have to translate 
$I$ by at most ten times its length, so that $\tilde\chi_I$ stays the same up to some bounded factor. 

We consider the two cases $|I_P| = |I|$ and $|I_P| < |I|$. Assume first $|I_P| = |I|$. 

Observe that by $i$-non-lacunarity, $\omega_{P_i}$ is contained in $5\omega_{i,\xi_T,I}$. Hence $5\omega_{i,\xi_T,I}$ is contained 
in $9\omega_{P_i}$. Pick $\xi' \in \Gamma'$ so that $(\xi')_i$ is an endpoint of $10\omega_{P_i}$. Then 

    $\|\tilde\chi_I^{10} T_{m_{i,I}}(f_i)\|_2 \le C\|f_i\|_{P_i,(\xi')_i}$, 

by definition of the right hand side, because the multiplier $m_{i,I}$ is supported in $10\omega_{P_i}$ and 
satisfies (33), possibly with a constant. Now the claim follows from (39), which proves 
(40) in the case $|I_P| = |I|$. 

Now assume $|I_P| < |I|$. We consider again a singleton tree $\{P\}$ with top data $\xi, I$ so 
that $\xi_i$ is an endpoint of $5\omega_{i,\xi_T,I}$. Again, by $i$-non-lacunarity of $P$ we see that these top 
data indeed turn $\{P\}$ into a tree. Then the multiplier $m_{i,I}$ satisfies (35) with respect to 
this singleton tree as one can easily see, possibly with some constant, and (40) follows by 
definition of the tree size. 



Inequality (40) will be mainly applied in connection with Lemma 4.11. 
Note that our definition of size is $L^2$-based (but normalized to have an $L^\infty$ scaling, 
see the proposition below). In principle one can define $L^p$-based notions of size by using 
the $L^p$ norm instead of the $L^2$ norm in (32) and the properly adjusted normalizations 
thereafter. In the $M_i = 0$ case the $L^2$-based notion of size is essentially equivalent to the 
$L^p$-based notion thanks to Lemma 5.4 and the fact that the $P_i$ then have area 1. This is 
guaranteed only for $i = n$, which is why this case will have special treatment. 

However, when $M_i \gg 1$ the $L^2$-based notion of size is not equivalent to an $L^p$-based 
notion. In order to make the Bessel inequality (98) work we need $L^2$-based sizes, and this 
is one of the reasons for the restriction $2 < p_i < \infty$ in Theorem 1.2. Presumably one 
would need arguments such as those in [26] to remove this restriction. 



We conclude this section with the observation that the size is always controlled by the 
L°° norm. 

Proposition 6.3. For all $1 \le i \le n$ and arbitrary functions $f_i$, we have 

    $\mathrm{size}_i^*(\mathbf{P}) \le C\|f_i\|_\infty$.    (41) 

Proof Fix $i$. It suffices to show that 

    $\mathrm{size}_i(T) \le C\|f_i\|_\infty$ 

for all trees $T$ in $\mathbf{P}$. 

Fix $T$. We have to estimate both summands in (34). The second summand is immediate 
since $T_{m_{i,T}}$ is a universally bounded operator in $L^\infty$. 

We consider the first summand in (34). By frequency modulation invariance we may 
assume that $\xi_T = 0$. From (34), (32) it suffices to show that 

    $\sum_{P \in T} \|\tilde\chi_{I_P}^{10} T_{m_{P_i}}(f_i)\|_2^2 \le C|I_T|\, \|f_i\|_\infty^2$ 

whenever $m_{P_i}$ is supported on $10\omega_{P_i}$ and obeys (33) with $\xi_i = 0$. 

First suppose that $f_i$ vanishes on $3I_T$. Then a simple computation following Lemma 
5.5 shows that 

\\x}iT mPt mi<c\ip\Mnmio 

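for all $P \in T$, and the claim follows by summing in $P$. Thus we may assume that $f_i$ is 
supported in $3I_T$. It then suffices to show that 

    $\sum_{P \in T} \|\tilde\chi_{I_P}^{10} T_{m_{P_i}}(f_i)\|_2^2 \le C\|f_i\|_2^2$. 

By Khinchin's inequality, we may estimate the left-hand side by 

    $C\|\sum_{P \in T} \epsilon_P\, \tilde\chi_{I_P}^{10} T_{m_{P_i}}(f_i)\|_2^2$ 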

for a suitable choice of signs ep G {+1,-1}. But then the claim follows since the expres- 
sion inside the norm is simply a pseudo-differential operator of order in the symbol class 
S 1 ' -1 applied to fi, with bounds uniform in T, Mi, mp^ and ep (see |24[]). ■ 

We remark that one can improve the bound from 1 to $|E_i|/|E_k|$ for some other $1 \le k \le n$, 
if one is willing to remove from $E_k$ the exceptional set where the Hardy-Littlewood 
maximal function of $E_i$ is $\gg |E_i|/|E_k|$, and also to remove from $\mathbf{P}$ all multi-tiles whose 
spatial interval is contained in this exceptional set (cf. [20] Lemma 7.8; see also [26], [18], 
[12]). Such improved estimates will not be needed here, because we restrict attention to 
the case $p_i > 2$. 

7. Tree estimates 

To follow the approach outlined at the beginning of Section 6, it is necessary to estimate 
the sum on the left hand side of (20), where the summation over $\mathbf{P}$ is replaced by a 
summation over a tree, by the various $i$-sizes of the tree. In doing so we may assume 
that the tree has been selected by a greedy selection process, because all our trees will be 
selected this way. This tree estimate will be Proposition 7.1 below. 

Proposition 7.1. Let $T$ be a selected tree of a greedy selection process, and let $f_1, \ldots, f_n$ 
be test functions on $\mathbb{R}$ which satisfy the normalization 

    $\|f_i\|_\infty \le 1$    (42) 

for all $1 \le i \le n$. Then we have 



„ n n 



(43) 



yn-l 



whenever $\theta_n = 1$, and $0 < \theta_1, \ldots, \theta_{n-1} < 1$ are such that $\sum_{i=1}^{n-1} \theta_i < 2$, and the implicitly 
used constant $N$ is sufficiently large depending on $(\theta_i)$. 



In spirit this estimate is a version of Proposition 1.3 with additional localization in 
physical space. It would be very tedious to adapt the proof of Proposition 1.3 to the 
localized setting; we shall therefore follow a different approach. The idea is to replace the 
$f_i$ by functions (phase plane projections) which are very close to $f_i$ near the phase plane 
region of the tree, but essentially vanish outside this region. Then we shall apply 
Proposition 1.3 to these phase plane projections directly and recover the result of Proposition 
7.1 from there: the tree sizes on the right hand side of (43) will essentially be the suitably 
normalized $L^p$-norms of the phase plane projections. 

The parameters $\theta_i$ will be chosen depending on the exponents $p_i$ in Theorem 1.2. In 
the following we shall not write the $(\theta_i)$-dependence of our constants explicitly, in the same 
way as we do not write the $p_i$-dependence explicitly. 

In analogy to [|l7j, |j|, p(| one might expect a bound of \It\ YYi=i s^KT) on the 



right-hand side of (43). This bound is achievable in the non-degenerate case $m_i = O(1)$; 
however, in general only the $n$-th size $\mathrm{size}_n^*(T)$ can be recovered with a full power $\theta_n = 1$. 
This gain in the case $i = n$ will be crucial in the rest of the paper. One should probably 
be able to obtain the endpoint $\sum_{i=1}^{n-1} \theta_i = 2$ of this result, but we shall not attempt to do 
so here. 



We will prove Proposition 7.1 in this and the next section. We remark that knowledge 
of the proof of (43) will not be needed in the later sections. 

Proof Fix $T$, $\theta_i$, $f_i$. We shall exploit scale invariance to reduce to the case $|I_T| = 1$. By 
a frequency modulation leaving $\Gamma$ and $\Gamma'$ invariant, we may as well assume $\xi_T = 0$. 

By the triangle inequality it suffices to prove estimate (43) where the summation goes 
over the subset $T_A$ instead of $T$ for any subset $A \subset \{1, \ldots, n\}$. Fix such $A$ and let $B$ 
denote the set of non-lacunary indices, i.e., $\{1, \ldots, n\} = A \uplus B$. We may assume $T_A$ 
is non-empty and thus a tree with top data $(\xi_T, I_T)$. By Lemma 4.15 we have that $A$ 
is non-empty. Changing notation we may as well assume that all multi-tiles in $T$ have 
lacunarity type $A$ and thus $T = T_A$. Recall that $T_A$ also satisfies the crucial nesting 
property of Lemma 4.7. 

Let $\mathbf{J}$ denote the set $\mathbf{J} := \{J_Q : Q \in \mathbf{Q}_T\}$; its smallest element is greater than or equal to 0. 
Recall that by Lemma [O] the map Q *—>■ Jq is one-to-one. 

From (11) we may estimate the left-hand side of (43) as 

    $\sum_{j \in \mathbf{J}} \int \tilde\chi_j \prod_{i=1}^n |\pi_j f_i|$    (44) 

where the spatial cutoff $\tilde\chi_j$ and Fourier cutoff $\pi_j$ are defined as 

& := Xe qjt j = J2 XhJ ( 45 ) 

PeT:Qp=Qi 

and 



7T,- := TT n j 





for $j \in \mathbf{J}$. In particular, $\tilde\chi_j$ has Fourier support in the region $\{\xi : |\xi| \le 2^{Jj-J}\}$. We 
remark that our notation is sloppy here: the operator $\pi_j$ also depends on the parameter 
$i$. We will always write $\pi_j$ in combination with a function $f_i$, and the omitted index is 
always the one of the function $f_i$. 

Let $2 < p_i < \infty$ be chosen so that $p_i < 2/\theta_i$ for $1 \le i \le n-1$ and $\sum_{i=1}^n 1/p_i = 1$; 
this can be done by the hypotheses on $\theta$ stated in Proposition 7.1. We observe the simple 
bounds 

Lemma 7.2. For all $1 \le i \le n$, $j \in \mathbf{J}$, and intervals $I$ of length $|I| = 2^{-Jj}$ we have 

    $\|\tilde\chi_j^{1/2n} \pi_j f_i\|_{L^{p_i}(I)} \le C|I|^{1/p_i}\, \mathrm{size}_i^*(T)^{\theta_i}$. 
Proof By interpolation and Proposition 6.3 it suffices to prove the bounds 

    $\|\tilde\chi_j^{1/2n} \pi_j f_i\|_{L^2(I)} \le C|I|^{1/2}\, \mathrm{size}_i^*(T)$, 

    $\|\tilde\chi_j^{1/2n} \pi_j f_i\|_{L^\infty(I)} \le C$, 

and (in the $i = n$ case only) 

    $\|\tilde\chi_j^{1/2n} \pi_j f_n\|_{L^\infty(I)} \le C\, \mathrm{size}_n^*(T)$. 

The second estimate is immediate from the boundedness of the $f_i$, while the third follows 
from the first and Lemma 5.4 since $|v_n| = 1$ and thus $m_n = 0$. Thus it suffices to prove 
the first inequality. 
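
The interpolation step can be spelled out as follows. This is a sketch under the assumption that the three bounds above are read as stated, with $g := \tilde\chi_j^{1/2n}\pi_j f_i$ introduced here only as shorthand; it uses log-convexity of the $L^p$ norms together with the bound $\mathrm{size}_i^*(T) \le C$, which follows from Proposition 6.3 and the normalization (42). The constant $C$ may change from line to line.

```latex
\begin{align*}
\|g\|_{L^{p_i}(I)}
  &\le \|g\|_{L^2(I)}^{2/p_i}\,\|g\|_{L^\infty(I)}^{1-2/p_i} \\
  &\le \bigl(C|I|^{1/2}\,\mathrm{size}_i^*(T)\bigr)^{2/p_i}\,C^{1-2/p_i}
   = C\,|I|^{1/p_i}\,\mathrm{size}_i^*(T)^{2/p_i} \\
  &\le C\,|I|^{1/p_i}\,\mathrm{size}_i^*(T)^{\theta_i}
  \qquad (1 \le i \le n-1,\ 2/p_i > \theta_i), \\
\|g\|_{L^{p_n}(I)}
  &\le \bigl(C|I|^{1/2}\,\mathrm{size}_n^*(T)\bigr)^{2/p_n}\,
       \bigl(C\,\mathrm{size}_n^*(T)\bigr)^{1-2/p_n}
   = C\,|I|^{1/p_n}\,\mathrm{size}_n^*(T)
  \qquad (i = n,\ \theta_n = 1).
\end{align*}
```

The full power $\theta_n = 1$ is available only because the third bound supplies a factor of the size in $L^\infty$ as well.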

Fix $i$, $j$, $I$. From (45) we see that there exists $P \in T$ with $|I_P| = |I|$ such that we have 
the pointwise estimate 

    $\tilde\chi_j(x)^{1/2n} \le C\tilde\chi_{I_P}^{10}(x)$ 

on $I$. It thus suffices to show that 

    $\|\tilde\chi_{I_P}^{10} \pi_j f_i\|_2 \le C|I|^{1/2}\, \mathrm{size}_i^*(T)$. 

This however was observed in (39). 
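
The choice of exponents $p_i$ made before Lemma 7.2 can be sketched concretely. The routine below is one possible selection rule, not the one used in the paper: it assumes $n \ge 3$, $0 < \theta_i < 1$ and $\sum_{i=1}^{n-1}\theta_i < 2$, moves each $1/p_i$ ($i < n$) the same fraction `t` of the way from $\theta_i/2$ to $1/2$, and lets $p_n$ absorb the remainder. The name `choose_exponents` is hypothetical.

```python
from fractions import Fraction

def choose_exponents(thetas):
    """Given theta_1..theta_{n-1}, return [p_1, ..., p_n] with
    2 < p_i < 2/theta_i for i < n, 2 < p_n < infinity, and sum of 1/p_i equal to 1."""
    thetas = [Fraction(t) for t in thetas]
    if len(thetas) < 2 or not all(0 < t < 1 for t in thetas) or sum(thetas) >= 2:
        raise ValueError("need n >= 3, 0 < theta_i < 1 and sum(theta_i) < 2")
    half = Fraction(1, 2)
    low = sum(t / 2 for t in thetas)    # sum of 1/p_i (i < n) if p_i were 2/theta_i
    high = half * len(thetas)           # sum of 1/p_i (i < n) if p_i were 2
    target = (max(low, half) + 1) / 2   # lies strictly between max(low, 1/2) and 1
    t = (target - low) / (high - low)   # interpolation parameter, strictly in (0, 1)
    inverses = [th / 2 + t * (half - th / 2) for th in thetas]
    inverses.append(1 - sum(inverses))  # 1/p_n = 1 - target, strictly in (0, 1/2)
    return [1 / x for x in inverses]
```

For instance, $\theta_1 = \theta_2 = 1/2$ with $n = 3$ gives $p_1 = p_2 = 8/3$ and $p_3 = 4$. Exact fractions are used so that the constraint $\sum_i 1/p_i = 1$ holds without rounding error.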
From this Lemma and Hölder we have 

    $\int_I \tilde\chi_j \prod_{i=1}^n |\pi_j f_i| \le C\|\tilde\chi_j^{1/2}\|_{L^\infty(I)}\, |I| \prod_{i=1}^n \mathrm{size}_i^*(T)^{\theta_i}$ 

for all $j \in \mathbf{J}$ and $|I| = 2^{-Jj}$. Summing in $I$, we see that we may bound each summand in 
(44) by the right-hand side of (43). Thus the main difficulty is to obtain summability in 
$j$. 

In the non-degenerate case $m_i = O(1)$ this summability is obtained by noting there 
must be at least two lacunary indices (see footnote 2) in $A$, estimating those indices in $L^2$ in space (and 
$l^2$ in $j$), and taking all other indices in $L^\infty$. See e.g. [17], [18], [21] and the discussion in 
the introduction of [21]. This however is not feasible in the general case for two reasons. 
Firstly, there might only be one lacunary index; and secondly, one can only get good $L^\infty$ 
bounds for the $i$ index when $m_i = O(1)$. Thus we will have to invoke Proposition 1.3 as 
outlined before. 

We shall need to replace $\tilde\chi_j$ in (44) by a product of cutoff functions. If $\tilde\chi_j$ was a 
characteristic function, we could simply write it as a power of itself. However, it is an 
approximate characteristic function. From (19) we have: 

    $|\tilde\chi_j(x) - \chi_{E_{Q_j,T}}(x)| \le C(1 + 2^{Jj}\, \mathrm{dist}(x, \partial E_{Q_j,T}))^{-N^2+1}$.    (46) 

The right hand side can be handled by Lemma 4.8. 

Lemma 7.3. Let 1 < k < n, then 

/n n 

fe--^)n^i<^i/Tin size *( T ) 9! - 
jtu »=1 i=l 

Proof By the triangle inequality, we can estimate this sum by 



/n 
ifc-xjini*tf 



By Holder and Lemma |7.2|, this is bounded by 



a 






On the other hand, from (|46"D and the lower bound in (|16D we have 

1^1 II (Xj ~ X k j)xJ h \\l~(i) < C J( 1 + 2 Ji dist(x, dE Qi , T ))- N . 
Summing this in / and evaluating the integral, we estimate the previous by 






and the claim follows from Lemma 4.8 



2 This is due to the fact that the cubes lOQi must intersect V in order to have a non-zero contribution 

to (in. 
Instead of estimating (44), it suffices by the above Lemma to show that 



/n n 

Ux^M<C\I T \l[size;(T) e >. 
,-_,) i=l i=l 



( 47 ) 

To do this we shall now introduce the crucial tree projection operators. 

Proposition 7.4. Let the notation and assumptions be as above. Then for each $1 \le i \le n$ 
there exists a function $\Pi_i(f_i)$ with the following properties. 

• (Control by size) We have 

    $\|\Pi_i(f_i)\|_{p_i} \le C|I_T|^{1/p_i}\, \mathrm{size}_i^*(T)^{\theta_i}$.    (48) 

• (Tlj(/j) approximates fi on T: lacunary case) For all i G A and j G J, we have 

XjKjfi = S j+mi IU(fi) (49) 

where Sj +mi is a suitable Littlewood-Paley projection to the frequency region {£ : 
± 2 J(j+m t ) < |£| < 5000C 2 J ^' +m ')}. 

• (Tli(fi) approximates fi on T: non-lacunary case) For all i G B, jo G J, and all 
intervals Iq of length 2~ J i°, we have the bounds 

WxT^M ~ Mfi))\Wi ) < Csize*(T) e >\I \ 1/p *- 1 J xlHo (50) 

where fij is the function 

/*,•(*) := Y,2- IJ '- JU10 ° E (1 + 2 JJ > - y\r W ° (51) 



and the sets $E_j$ have been defined in Definition 4.9. Also, for all $1 \le i \le n$, $j \in \mathbf{J}$, 
and intervals $I$ of length $|I| = 2^{-Jj}$ we have 

    $\|\tilde\chi_j^{1/2n} \pi_j \Pi_i(f_i)\|_{L^{p_i}(I)} \le C|I|^{1/p_i}\, \mathrm{size}_i^*(T)^{\theta_i}$.    (52) 

One can construct the $\Pi_i$ to be linear operators, but we shall not use this. 

Roughly speaking, $\Pi_i(f_i)$ is the projection of $f_i$ to the region $\bigcup_{P \in T} P_i$ of phase space. 
Although such a description can easily be made rigorous in a Walsh model, and is not 
too difficult in a lacunary Fourier model, it is substantially more delicate in the Fourier 
setting in the non-lacunary case. 

Note that the function $\mu_j$ can be controlled by Lemma 4.12. 

We prove this rather technical Proposition in Section 8. For now, we see why this 
proposition, combined with Proposition 1.3, gives (43). 

To exploit the fact that $\Pi_i(f_i)$ approximates $f_i$ on $T$, we use (49) and the triangle 
inequality to estimate (47) by the main term 

E I I i\[x^fi){\[x^Mfi))\ (53) 

jeJ J i<=A ieB 

plus j^B error terms of the form 

Ei/w(/io-n«o(/«o))( n Witt n xwUiifi)] (54) 

jeJ i&A Or i<i ieB:i>i 
where $i_0 \in B$. 

Let us first control the contribution of a term (54). Fix $i_0$. By the triangle inequality 
we may estimate (54) by 



e e \xjKj(fi -iii (fi ))\( n ix#i/ti) n i^J n i(/i)i- 

jGJ 7:|7|=2- J ^ i£^ Or «<i ieB:i>i 

By Holder, we may estimate the previous by 

E E llx#i(/i - n io(/io))IL^o(z)X 

j€J T:|J|=2- J J 

x II \\xftjfi\\ L n{i) Yl \\Xj^i(fi)\\Ln(i). (55) 

ieA Or i<io i£B:i>io 

The first factor we estimate using (50). The second group of factors we estimate using
Lemma 7.2. For the last group we use (52). Combining all these estimates and using (|J),
we see that we may estimate (55) by



jeJ I:\I\=2- J 3 i=l 

which of course sums to 



n „ 

c e e (ri size *( T ) 91 )/^ 



'ft r. 

C(Y[size*(Tf)J2vr 
i=i j J 



Expanding out $\mu_j$, we may estimate this by



n „ 

c(II size *( T ) 9l )E E /(i + 2 Jj i 



|x-i/|)- 100 rfx; 
computing the integral, we thus obtain 



C([[size*(T) 9i )J2 2 ' JJ * d ^ 
»=i j 

and the claim follows from Lemma 4.12.



It remains to estimate (53). By repeating the proof of Lemma [7l| (but using (52) in
place of Lemma 7.2 when $i \in B$) it suffices to estimate

Ei An^i^di^ 11 ^))! 

jGJ ^ iGA ieB 

which we rewrite using (49) as



ei /(n^+m^^))^^^^))!. 



By Proposition 1.3, we may estimate this by

n 

cUma 



i)\\Vii 

1=1 
and the claim follows from (fig). Observe that in applying Proposition 1.3 we have used
$A \neq \emptyset$, which had been a consequence of Lemma 4.15.



This concludes the proof of Proposition 7.1, except for the proof of Proposition 7.4,
which shall be done in the next section. ■

8. Proof of Proposition 7.4

We now prove Proposition 7.4. The proof of this Proposition is rather involved, and the
techniques used here are not needed elsewhere in the paper; readers who are interested
in the general shape of the proof of Theorem 1.2 may wish to skip this section on a first
reading.

We begin with the lacunary case $i \in A$, which is substantially easier.

Fix $i \in A$. We define $\Pi_i(f_i)$ as

\[ \Pi_i(f_i) := \sum_j \tilde\chi_j \tilde\pi_j f_i \tag{56} \]

in this case. From the Fourier support of $\tilde\chi_j$ and $\tilde\pi_j$ we see that (49) is obeyed. It remains
to show (48).

By interpolation and Proposition 6.3 it suffices to prove the estimates

\[ \|\Pi_i(f_i)\|_2 \le C |I_T|^{1/2}\,\mathrm{size}^*_i(T) \tag{57} \]

\[ \|\Pi_i(f_i)\|_{BMO} \le C \tag{58} \]

with the additional estimate

\[ \|\Pi_n(f_n)\|_{BMO} \le C\,\mathrm{size}^*_n(T) \tag{59} \]

when $i = n$. Here we read BMO as $J$-dyadic BMO (see the proof below).



We first prove (57). By orthogonality and (45) we have

I|n*(/<)ll2 = £ll E xipj^jfiWl 



jeJ PeT:Qp=Qi 



Since we have the pointwise bound 

E xi»t ^ c 



P&T:Q p -- 



we can bound the previous using Cauchy-Schwarz by 

cj: e mxi^i i/a v. 

5 
But by (39) this is bounded by



2 
]Jt\\2- 



jeJ p eT ,Q p=Q j 



cJ2 E WMko 

jeJ p eT .Q_. =Q j 

and the claim follows from (34).

The claim (58) follows immediately from the observation that the linear operator $\Pi_i$
is a pseudo-differential operator of order 0 in the symbol class $S^0_{1,1}$, and therefore maps
$L^\infty$ to BMO (see e.g. [0]), so it remains to show (59).
We need to show that

\[ \mathrm{osc}_I(\Pi_n(f_n)) \le C\,\mathrm{size}^*_n(T) \tag{60} \]

for all $J$-dyadic intervals $I$, which is what we mean by $J$-dyadic BMO.
Fix $I$. We expand the left-hand side of (60) using (45) as



OS Cl (J2xipJp^ P Jn)- 



PeT 

First consider the contribution of the coarse scales, when $|I_P| > |I|$. By the sub-linearity
of $\mathrm{osc}_I(\cdot)$, it suffices to show that

Y osc I (xi P ,j P ^ Pn fn)<Csize* n (T). (61) 

PeT:\I p \>\I\ 

We use the easily verified Poincaré inequality

\[ \mathrm{osc}_I(f) \le C \Big( |I| \int_I |\nabla f|^2 \Big)^{1/2}. \]

The Fourier multipliers \Ip\W7r^ and 7r Wp have symbols adapted to I0up n which vanish 
at the origin, whereas Xi~,j- an d Ip^Xi-,j- are dominated by xj°°- By the previous and 
(J39l), we thus have 

OS Cl (xi p , jp ^ P Jn) < C^{1 + diSt ^ J p) )-100 si<(T)- 

\ 1 p\ \ 1 p\ 

Summing over all P such that \Ip\ > \I\ we see that this contribution is acceptable. 
It remains to consider the contribution of the fine scales, i.e. those P in the set 

T 7 :={PGT: \Ip\ < \I\}. 

For this contribution we shall use the estimate

\[ \mathrm{osc}_I(f) \le C |I|^{-1/2} \| f \tilde\chi_{I,j_I} \|_2, \]

where $2^{-Jj_I} := |I|$. It thus suffices to show that

II E Xi, 3l Xip,p^ P Jnh < C|/| 1/2 size;(T). (62) 

PgTj 

For fixed jp, the expression inside the norm has Fourier support in the region |£| ~ 2 J ip 
(lacunary supports). By orthogonality we may thus estimate the left-hand side of ( |6~^) by 

C ( Y I' E Xl,3iXlp,3p^ P Jn\\l) l/2 - 

jeJ:j>jj p eTrjp=j 

For fixed jp = j, the xi~,j-( x ) are uniformly summable in Ip and x. By Cauchy-Schwarz 
we may thus estimate the previous by 

jeJ:j>ji p eTl :jp=j 
On the other hand, from (091) we see that



1/2 ~ f I, ^ nn . dist(/,/^ 100 



II 1 / z ~ t II ^ n(-\ i \ ' P \ — lUUll i- n 
\\Xl,hXl p , jp ^ Pn fn\\2 < C(l + |yj ) ||/«||p n ,0- 

Thus we may estimate the left-hand side of fl62f ) by 



ni ST^ n i S H ' fg2 \-200i| t \\i \iii 
U V / „ V 1 '' ijj J H/nllPn.oJ • 



PGTj 

We now break up this as a sum over sub-trees of $T$. Let $T_I^*$ denote those multi-tiles in
$T_I$ for which the interval $I_P$ is maximal. Clearly the $I_P$ are disjoint as $P$ varies over $T_I^*$.
We can thus rewrite the previous as



dist{l ,lp) ^_ 200u „ ||2 y 2 
p n ,o) 



c(E E (i + ^jTp^)- 200 !!/" 

P'eT* PeT-.ipCip, 
Since |i^,| < |J|, we can estimate this by 



c( ^ (1 + dist(J^) r200 ^ ||/b| 

P'GT* P<=T:I p CI p/ 



2 Nl/2 
Pn,0J ' 



For each $P' \in T_I^*$, the collection $\{P \in T : I_P \subseteq I_{P'}\}$ is a subtree of $T$ with top data
$(\xi, I_{P'})$. By (36) we can thus estimate the previous by

P'eT} 
Since the Ip are disjoint and of size < |/|, we can bound this by 

200 J_\l/2 



Csize:(T)(J(l + ^^)-™dx) 



and the claim follows. 

It remains to handle the more difficult non-lacunary case. Fix $i \in B$.

For each real number $j$, let $T_j$ be a Fourier multiplier (defined, say, by dilations of a fixed
multiplier) whose symbol is adapted to the frequency region $\{|\xi| < 2^{2+Jj}\}$ and
equals 1 for $\{|\xi| < 2^{1+Jj}\}$, and let $S_j$ be the associated Littlewood-Paley projections
$S_j := T_j - T_{j-1}$. We may assume that the kernels of $T_j$ and $S_j$ are real and even.

In (56), one used lacunary Fourier multipliers followed by smooth spatial cutoffs to
construct $\Pi_i(f_i)$. In the non-lacunary case these smooth spatial cutoffs are not desirable
as they interfere$^3$ with the ability to decompose non-lacunary multipliers as a telescoping
series of lacunary ones. To avoid this difficulty we are forced to use rough spatial cutoffs
instead.

A first guess as to the construction of Ui(fi) would be 



3 Of course, one could try to control this error with commutator estimates, however one does not seem 
able to recover the crucial 2~w • J '°l' 10 ° decay in ( pl| ) by this approach. 
where for each $x \in E_0$ we define the integer-valued function $j(x)$ by

\[ j(x) := \max\{0 \le j : x \in E_j\}. \]

One can expand $\tilde\Pi_i(f_i)$ as a telescoping series:

\[ \tilde\Pi_i(f_i) = \chi_{E_0} T_{m_i} f_i + \sum_{0<j} \chi_{E_j} S_{j+m_i} f_i. \tag{63} \]
This proposed projection turns out to obey (0), but does not obey (50) due to the poor
frequency localization properties of the characteristic functions $\chi_{E_j}$. Specifically,
the cutoffs destroy the vanishing moments of the $S_{j+m_i} f_i$, and this will cause a difficulty
when trying to sum in $j$ because the projection $\tilde\pi_j$ is non-lacunary.
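
The telescoping identity (63) is pure algebra and can be sanity-checked on a toy discrete model. The sketch below is illustrative only: the grid, the nested sets `E[j]`, and the random stand-in values for $T_{j+m_i} f_i$ are assumptions made for the example, and the "first guess" is taken to be $\chi_{E_0}(x)\, T_{j(x)+m_i} f_i(x)$, which is inferred from the later use of $T_{j(x)+m_i} f_i$ rather than quoted from the text. Only the relations $S_j = T_j - T_{j-1}$ and $j(x) = \max\{j : x \in E_j\}$ come from the argument above.

```python
import random

random.seed(0)
N, JMAX = 32, 5  # hypothetical grid size and number of scales

# T[j][x] stands in for T_{j+m_i} f_i(x); arbitrary values suffice for the algebra.
T = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(JMAX + 1)]
# S_j := T_j - T_{j-1} for j >= 1, as in the text.
S = [None] + [[T[j][x] - T[j - 1][x] for x in range(N)] for j in range(1, JMAX + 1)]

# Nested sets E_0 > E_1 > ... (toy choice: shrinking initial segments of the grid).
E = [set(range(N - 4 * j)) for j in range(JMAX + 1)]

def j_of(x):
    """j(x) := max{ j >= 0 : x in E_j }, defined for x in E_0."""
    return max(j for j in range(JMAX + 1) if x in E[j])

def first_guess(x):
    """chi_{E_0}(x) * T_{j(x)}(x): the rough-cutoff projection before correction."""
    return T[j_of(x)][x] if x in E[0] else 0.0

def telescoped(x):
    """Right-hand side of (63): chi_{E_0} T_0 + sum over j > 0 of chi_{E_j} S_j."""
    total = T[0][x] if x in E[0] else 0.0
    for j in range(1, JMAX + 1):
        if x in E[j]:
            total += S[j][x]
    return total

# Largest pointwise discrepancy between the two expressions (rounding error only).
max_gap = max(abs(first_guess(x) - telescoped(x)) for x in range(N))
```

The check uses only that the sets are nested, so that $x \in E_j$ exactly when $j \le j(x)$; no property of the actual multipliers is modelled.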

To get around this problem we shall modify each term of $\tilde\Pi_i(f_i)$ (except for the first
term $\chi_{E_0} T_{m_i} f_i$) to have a zero mean. In order that these modifications do not collide with
each other, we shall place them in disjoint intervals.



We recall the intervals IJ and /j and the collections Qj introduced in Lemma |4.12|. Let 



and (p r j • be bump functions adapted to l\, H with total mass 



'I,j <*"" Wj 



bl I,3= / 07 J =2- J °' +m °. 



For each $j \ge 1$ and $I \in \Omega_j$, decompose $\chi_I$ as $\chi_I = H^l_I + H^r_I$, where $H^l_I(x) := H(x - x^l_I)$
and $H^r_I(x) := -H(x - x^r_I)$ are shifted Heaviside functions.

For each $j \ge 1$ and $I \in \Omega_j$, define the quantities $c^l_{I,j}$ and $c^r_{I,j}$ by

\[ c^l_{I,j} := 2^{J(j+m_i)} \int H^l_I\, S_{j+m_i} f_i, \qquad c^r_{I,j} := 2^{J(j+m_i)} \int H^r_I\, S_{j+m_i} f_i. \tag{64} \]



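A basic estimate on $c^l_{I,j}$ is

Lemma 8.1. Let $j > 0$ and $I \in \Omega_j$. Then we have the estimate

\[ |c^l_{I,j}| \le C\, 2^{J(j+m_i)} \int \frac{|S_{j+m_i} f_i(x)|}{(1 + 2^{J(j+m_i)} |x - x^l_I|)^{100}}\, dx. \tag{65} \]

In particular, we have

\[ |c^l_{I,j}| \le C\, 2^{Jm_i/2}\, \mathrm{size}^*_i(T) \tag{66} \]

and

\[ |c^l_{I,j}| \le C. \tag{67} \]

Proof. From (64) we may write

\[ c^l_{I,j} = 2^{J(j+m_i)} \int (\tilde S_{j+m_i} H^l_I)\, S_{j+m_i} f_i. \]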



Here 5j +mi is a Littlewood Paley projection whose multiplier is supported in 5u it T, is 
constant 1 for £ G 4^^ \ 2J~ 1 u i> T, and vanishes on J~ 1 L0i t T- The claim (|65| ) then follows 
by using repeated integration by parts to obtain pointwise bounds on Sj +mi H\. 
There exists a $J$-dyadic interval $I'$ of length $2^{-Jj}$ whose left endpoint coincides with $x^l_I$,
because $E_j$ and hence $I$ is a union of dyadic intervals of length $2^{-Jj}$ by Lemma 4.10. By
Lemma 4.11 there is a tile $P \in T$ with $I_P \subseteq 10I'$.



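The bounds (66), (67) then follow from ©, ©, ©.
We can now define the corrected projections $\Pi_i(f_i)$ as

\[ \Pi_i(f_i) := \tilde\Pi_i(f_i) - \sum_{0<j} \sum_{I \in \Omega_j} \big( c^l_{I,j} \phi^l_{I,j} + c^r_{I,j} \phi^r_{I,j} \big). \tag{68} \]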



We first verify (48). We first observe that

\[ \|\Pi_i(f_i)\|_\infty \le C. \tag{69} \]



Indeed, from (f%2"|) we see that Tj 0+mi fi and Hj(/j) are bounded. The remaining terms of 
(|69|) then follow from fl6?|) and the disjointness of the 4> l jj, and similarly for c^j and 0J ■. 
In light of (69) it suffices by interpolation and Proposition 6.3 to prove (57), together
with the bound

\[ \|\Pi_i(f_i)\|_\infty \le C\, 2^{Jm_i/2}\, \mathrm{size}^*_i(T). \tag{70} \]

In fact, to prove (48) we only need the bound (70) for $i = n$, but the general case will be
useful later.

We now prove (70). From (66) the contribution of the $c^l_{I,j}$ is acceptable. Similarly for
$c^r_{I,j}$. Thus it remains to show that

\[ \|\tilde\Pi_i(f_i)\|_\infty \le C\, 2^{Jm_i/2}\, \mathrm{size}^*_i(T), \]

or in other words that

\[ |T_{j(x)+m_i} f_i(x)| \le C\, 2^{Jm_i/2}\, \mathrm{size}^*_i(T) \]

for all $x \in E_0$.

Fix $x$, and define $j := j(x)$. From the definition of $j(x)$ there exists a $J$-dyadic interval
$I' \subseteq E_j$ of length $2^{-Jj}$ which contains $x$. It thus suffices to show that

\[ \|T_{j+m_i} f_i\|_{L^\infty(I')} \le C\, 2^{Jm_i/2}\, \mathrm{size}^*_i(T). \]

By Lemma |4.11| there is a multi-tile P G T with Ip C 10/'. Hence the desired estimate 
follows from (^0) and Lemma |5. | . 

This completes the proof of (70).

It remains to prove (57). From (68), the triangle inequality, and the disjointness and
size of the $\phi^l_{I,j}$ (and of the $\phi^r_{I,j}$) it suffices to show that

\[ \|\tilde\Pi_i(f_i)\|_2 \le C |I_T|^{1/2}\, \mathrm{size}^*_i(T) \tag{71} \]

and

\[ \Big( \sum_{0<j} \sum_{I \in \Omega_j} |c^l_{I,j}|^2\, 2^{-J(j+m_i)} \Big)^{1/2} \le C |I_T|^{1/2}\, \mathrm{size}^*_i(T) \tag{72} \]

together with a similar bound for the $c^r_{I,j}$.

The bound (72) follows from (66) and Lemma 4.12.
To prove (71) we expand



°<3 °<J InEj\E j+1 ^%:\I\=2~ J i 

As $j$ and $I$ vary in the above sum, the sets $I \cap E_j \setminus E_{j+1}$ are pairwise disjoint, hence it
suffices to show

J2 J2 WTi+mJiWlm) < C|/T|size*(T) 2 

0<3 InEj\E j+1 ^$:\I\=2- J i 

For each $j$ and $I$ in this sum there is a dyadic interval $I' \subseteq I$ of length $2^{-J}|I|$ which is
contained in $E_j \setminus E_{j+1}$. This follows from Lemma 4.10. As $j$ and $I$ vary, these intervals $I'$
are pairwise disjoint. Hence it suffices to show that

\\T j+m Ji\\ L 2 {I) < C\I\^ 2 size*(T) 

for all j, I in the above sum. But for such j, J we can find a multi-tile P G T with 
Ip C 10/ by Lemma [OX The claim then follows from fl4"D|). This proves ( |7T| ) and thus 



(|57j). The proof of (^) is now complete. 

The estimate (52) will follow from (50), Lemma 7.2, the triangle inequality, and the
fact that the $\mu_j$ are uniformly bounded. Thus it only remains to verify (50).

Fix $j_0 \ge 0$ and $I_0$ such that $|I_0| = 2^{-Jj_0}$. From the frequency support of $\tilde\pi_{j_0}$ we may
replace $f_i - \Pi_i(f_i)$ with $T_{j_0+m_i} f_i - \Pi_i(f_i)$. By interpolation and Proposition 6.3 it suffices
to show the bounds

\\x/ 2n ^jo( T 3o+^ifi ~ ni(/<))|U«(j ) < C^M / X? ^o (73) 

Mo I ■/ 

and 

Hxi /aB *jb(2j64™,/« - n^C/,))^^) < Csizejmi/ol-^y ^^ (74) 

with the additional bound 

WxJi^^joiTjo+mJi - ni(/t))IU<»(jb) < Csize *i( T )-rn / x 2 i Vh (75) 

l J o| J 

when i = n. 

We first show (|73|) . For this estimate the only bound we use on fi is fl42[) . 

First suppose that J is outside £^- . Then the claim follows from (|42]) , (]69|) , the decay 
of Xjo an d the estimate 



1 + 2 J ^dist(I ,dE J0 ))- N < C±- [xlfi j0 . 

\ I o\ J 



(76) 
It remains to consider the case when $I_0$ is inside $E_{j_0}$. In this case the cutoff $(\tilde\chi_{j_0})^{1/2n}$ is
useless and will be discarded. We now decompose

Tj + mi fi — hi i ( fi ) =X~$l\E Jo Tjo+rrn fi (77) 

-XR^ifi) (78) 

+xr\%, E J2 cl iA,i (79) 



J J0 

i<i<io /en 



+xr^ E E^« ( 8 °) 



J J0 

i<i<io /ef2j 



- 53 Yl H i s ^ *f* - c iAo ( 81 ) 

j <j ieUj 

-JJ]flIWi-%- (82) 

io<j /en., 

From (42) and the separation between $I_0$ and $\mathbb{R} \setminus E_{j_0}$ and the decay of the kernel of $\tilde\pi_{j_0}$
we can bound the contribution of (77) by (76) as desired. The terms (78), (79), (80) are
similar, although (79), (80) use (67) instead of (42).

Now consider (81). The idea is to interact the smoothing properties of $\tilde\pi_{j_0}$ with the
moment property

\[ \int H^l_I S_{j+m_i} f_i - c^l_{I,j} \phi^l_{I,j} = 0 \tag{83} \]

coming from the construction of the $c^l_{I,j}$.
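
The moment property (83) is a direct consequence of the normalisation in (64): the coefficient is the integral of the truncated piece divided by the total mass of the bump. A minimal numerical sketch of this zero-mean correction follows. The profile `g` (standing in for $H^l_I S_{j+m_i} f_i$), the bump `phi`, and the grid are all hypothetical choices made for illustration; none of them is taken from the paper.

```python
import math

def riemann(values, dx):
    """Riemann-sum approximation of an integral on a uniform grid."""
    return sum(values) * dx

dx = 1e-3
xs = [k * dx for k in range(-2000, 2001)]

# g stands in for H^l_I * S_{j+m_i} f_i: an oscillating, decaying profile
# multiplied by a Heaviside cutoff at x = 0 (hypothetical choice).
g = [math.cos(5 * x) * math.exp(-x * x) if x >= 0 else 0.0 for x in xs]

# phi stands in for the bump phi^l_{I,j}, supported next to the cutoff point.
phi = [(x * (0.5 - x)) ** 2 if 0 <= x <= 0.5 else 0.0 for x in xs]

mass = riemann(phi, dx)        # plays the role of the total mass 2^{-J(j+m_i)}
c = riemann(g, dx) / mass      # analogue of (64): c := mass^{-1} * integral of g

# The corrected term g - c * phi has mean zero, which is the analogue of (83).
corrected = [gv - c * pv for gv, pv in zip(g, phi)]
mean_after = riemann(corrected, dx)
mean_before = riemann(g, dx)
```

The rough cutoff gives `g` a non-zero mean, and subtracting `c * phi` restores mean zero while only changing the function on the support of the bump.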

By the triangle inequality and the definition of fij it suffices to show that 

ii**(^w< - ^jiu-do) < c2-&-*>/*»2- j 0-*>(i + dist ^°; 4) )- 30 

for all j > jo and / G IX,. 

Fix j, /. We first assume Xj G 3/o, in which case we may replace the right-hand side by 
C2-(i-i°y ioo 2- J( - j - jo) . Let K(x) denote the convolution kernel of vr io . By (||) it suffices 
to show that 



M) 



(K(x -x)- K(x - ^{HlSj+^Mx) - c\J ItJ {x)) dx\ < C2-^'°I/ioo 2 -J&--jo) 

for all Xq G Iq. 

Fix xq G Jo- The contribution of djj<j>ij can be controlled using flB7|), the fact that the 
support of 0j • is within a distance of (72 -J ^ +mi ^ from x\, and the bound 

|if(xo ~ a:) - #(^o - x\)\ < C2 2J ^ 0+lcai) \x - x\\. (85) 

To deal with the H\Sj +m Ji term we rewrite it as 

(K(x - x) - K(x - Xj))([#j, S j+m .]fi(x)) dx\ 
with the commutator $[H^l_I, S_{j+m_i}]$ of $H^l_I$ and $S_{j+m_i}$. Here we have used that $\tilde\pi_{j_0} S_{j+m_i} = 0$
and the kernel of $S_{j+m_i}$ has mean zero. However, we have the easily verified estimate

\{HlS J+m Mx)\ <C(l + 2 J ^\x-x l I ))- 200 \\f l \\ oo . 

The desired bound for the H\Sj +mi fi term now follows again from (|85|) . 

It remains to consider the case x\ ^ 3 Jo- This is done similarly to the previous case, 
however using instead of (^) the kernel bound 

\K{x -x)- K{x - 4)| < C2 2J ^ +m <)(l + 2 Jj \x - 4|)- 200 |a; - x\\ 

for 2\x — x l j\ <\x — x \. 

The treatment of (82) is similar. This concludes the proof of (|73|) in all cases. 

The inequality flT5| ) follows from ([73|) and Lemma |Oj. Thus it only remains to show 

Now we show ([T3|). This will be a reprise of the proof of ([7^), except that we shall rely 
on (|0|) instead of fl32|). 



We turn to the details. We shall only concern ourselves with the case when 5 Jo intersects 
Ej . The case when 5 Jo is disjoint from Ej is done with similar arguments as those below; 
one loses some powers of the separation between I and Ej whenever one applies (|39"D , 
but this is more than compensated for by the decay of Xj , which also gives the additional 
factor of (|76|) . We omit the details. 

Discarding the cutoff Xj , we reduce to 

||^ ( T io+mJi - lii(fi))\\ L \i ) < Csize*(T)\I \ 1/2 / xj Vjo- 
Since Iq is near Ej , we can find a multi-tile P e T such that Ip C 10 Iq by Lemma 



4.11| . From (^) we have 

Wxfjjo+^fih < C\I \ 1/2 size:(T). (86) 

Decompose T jo+m Ji - Tl^fo) as (77) - (78) + (79) + (80) - (81) - (82) as before. 
First consider the contributions (77), (78), (79), (80) which come from outside Ej . Let 
us first examine the non-local portion of these contributions, or more precisely 

ll^o(XR\3/o(( 77 ) - ( 78 ) + ( 79 ) + (80)))1U 2( / 0) . 
From the rapid decay of the kernel of tcj we may estimate this by 

2 -iooj mi ||~ioo ((77) _ (7g) + (7Q) + (g0))) || 2 _ 

But each of these terms is acceptable thanks to (|70|) (or more precisely, the arguments 



used to prove fl70|) but applied to (77), (78), (79), (80) rather than 11* (/*)). 
It remains to estimate the local contribution 

lfe(X3/ ((77) - (78) + (79) + (80)))|| L2(/o) . 

Of course, these contributions are non-zero only when 3/o is not contained in Ej , so 
that Iq is near the boundary of Ej . We can then discard the projection 7f, and the [i- 
factor on the right hand side and reduce to showing that 

||(77) - (78) + (79) + (80)|| L2(37o) < C|/o| 1/2 size*(T). 

The contribution of (77) is acceptable by (|86|). 
Now consider (78). The set $(E_0 \setminus E_{j_0}) \cap 3I_0$ is the union of at most three intervals $I_1$ of
length $2^{-Jj_0}$. On each of these intervals $I_1$, the function $j(x)$ is equal to $j_0 - 1$ by Lemma 4.10.



For each I\. the claim then follows from flS6|) , Lemma |5.5| , and the identity Tj _i +mi = 
T T 

Now consider (79). By Lemma [4.12| , there are a bounded number of functions Z 7 • with 

j < jo which have support near 3/o- Also, since Iq is near the boundary of Ej , we have 
j > j — 2 for all these functions </4 ■. Thus it suffices to show 

| c y 2 J(-;- mi )/2 < c|/ r /2 size*(T) 

for each of the coefficients c 7 • involved. 

From ( p5| ) and Cauchy-Schwarz we see that 

|J I < ^r,J0'+™<)/2||v 10 <7. f.|L 

Since jo — 2 < j < j we may replace Jo on the right hand side by the J-dyadic interval 
of length 2~ Jj which contains Jo- The desired estimate follows now from Lemma |4.11| and 

©■ 

The treatment of (80) is similar to (79), which concludes the discussion of the contri- 
butions (77), (78), (79), (80) which come from outside Ej . 

We turn to (81). As with the corresponding treatment of (81) in (|73|), we shall extract 
a gain by interacting the smoothing of ttj with the moment condition (|83|) . 

By fl5i"D and the triangle inequality^ it suffices to show that 

l|frjo(ffi$Wi - 4M\\*to) ^ C'|/ r 1/2 size*(T) y ^ o 2-l^ol/ioo (1 + 2 J 3{x _ ^ jj-ioo 
for all jo < j and I eQj, Evaluating the integral, this becomes 

IlirjbCff&W, - 4,0^)11^(70) < Csize*(T)2^/ 22 -^2-^^)/ 100 x/o(4) 2 - 
Fix j, /. Observe from Fourier support considerations that 

Kjo((Tj+rn i -lH I )Sj+ la .fi) = 0; 

in particular, (Tj +mi _iH l j)Sj +mi fi has mean zero. It thus suffices to show that 

W^oFjjAlHIo) < Csize*(T)2^/2 2 -^ 2 -0-^)/ 10 ^io(4) 2 
where 

-F},/,* := [(1 - ^• +m ._i)F / ]5' J - +mi /j - Cjjfijj. 
From the construction of c\. we see that Fjj^ has mean zero, and thus has a primitive 
V^ 1 Fjj ji which goes to zero at ±oo. We thus may write 

WtT- F t •Ilr2/r ^ = 2 J ^ 0+mi ) II 9" J 0'"+ m ') V7T • CV _1 F- r -nlr2/r ^ 

Ir'jo^j.-f.Mli 2 ^) ^ 11^ v/ 'jcA v ^j./.iJHL 2 ^)- 

The multiplier 2~ J ^' 0+m ^V7fj is bounded in L 2 and is essentially local at scale 2~ J ^ 0+mi \ 
and hence at scale 2 _J - ?0 . Thus we may estimate the previous by 

2 J(i0+mi) ||x}J V- 1 J F i , /ii || 2 . (87) 

4 The use of the triangle inequality is a little inefficient, costing a factor of 2~ J ^~^ '' 2 or so, because 
it does not exploit orthogonality in physical space. However the mean zero condition gives us a gain of 
2 _ J u— io) 5 so we s ti}i enc i U p w ith the improvement of 2~ W— jo)/ioo a t the end. 
We now claim the pointwise bound

\Fj,i,i(x)\ < Csize*(T)2 Jm °/ 2 (l + 2 J ^ j+m ^\x - x\\y™. (88) 

Assuming (88) for the moment, we may integrate it (using the mean zero condition to
integrate from either $+\infty$ or $-\infty$, depending on which one gives the more favorable
estimate) to obtain

iV-^vr.ifa)! < Csize*(T)2- J(j+mi) 2 Jm ' /2 (l + 2 J{j+m ' ) \x - 4|)" 49 . 
We can thus estimate fl87l) by 

C2 jyo+m i )^ /o ( a .^100 size *^ 2 -jy+m i ) 2 Jm i /2 2 -jy+m i )/2 ? 

which is acceptable. 

It remains to show (]88f) . The contribution of c\ A\ ,- is acceptable from (|66| ), so it 



remains to consider [(1 — Tj +Tai ^i)H l I ]Sj +rn Ji. From repeated integration by parts we 
have the pointwise estimate 

|(1 - T j+mi _ x )H\{x)\ < C{\ + 2 J ^ +m ^\x - xW)- 100 

so it suffices to show that 

\S j+mi fi{x)\ < C2 Jm ^ 2 (l + 2 J ( j+m ^\x - x\\f°. 

Let I' be a dyadic interval of length 2~ J i which is adjacent to the left endpoint of /. By 
Lemma [4. 11| we can find a multi-tile P' ET with Ip C 10/'. From ( ^0|) we thus have 

\\x}?S j+mi f i \\ 2 <C2- J ^size*(T). 



From Lemma |5.4| we thus have 

llx^WilU < C2^/ 2 size*(T) 

and the desired bound follows. This completes the treatment of (81). 

The treatment of (82) is similar to (81). This completes the proof of (0), and therefore 



of (0). The proof of Proposition 7.4 is thus (finally!) complete.



9. Deducing Theorem 1.2 from Proposition 7.1

In this section we state standard Propositions which will allow us to deduce (20) and
hence Theorem 1.2 from Proposition 7.1.

The idea is to break the multi-tile set $\mathbf{P}_1$ into trees $T$, such that one has control on
the $i$-sizes $\mathrm{size}^*_i(T)$ and on the total tree width $\sum_T |I_T|$. This will be accomplished using
Proposition 6.3 and the counting function estimate of Proposition 9.2 on trees of a given
size.

The selection of the trees is done by a greedy selection process, which will be defined 
in various steps. We need the following definition: 

Definition 9.1. Call a tree convex, if it is a selected tree in a greedy selection process. 



In particular, convex trees satisfy Lemma \4-% Call a subset $\mathbf{P} \subseteq \mathbf{P}_1$ convex, if it is of
the form $\mathbf{P}_1 \setminus (T_1 \cup \dots \cup T_k)$ where $T_1, \dots, T_k$ are the selected trees of a greedy selection
process.
Proposition 9.2. Let $1 \le i \le n$, $m \in \mathbb{Z}$, and suppose that $\mathbf{P}$ is a convex collection of
multi-tiles such that

\[ \mathrm{size}^*_i(\mathbf{P}) \le \|f_i\|_2\, 2^{m/2}. \tag{89} \]

Then there exists a collection $\mathbf{T}$ of distinct convex trees in $\mathbf{P}$ such that

\[ \sum_{T \in \mathbf{T}} |I_T| \le C\, 2^{-m} \tag{90} \]

and the remainder set $\mathbf{P}' := \mathbf{P} - \biguplus_{T \in \mathbf{T}} T$ is convex and satisfies

\[ \mathrm{size}^*_i(\mathbf{P}') \le \|f_i\|_2\, 2^{(m-1)/2}. \tag{91} \]

We prove this Proposition in Section 10; it is the main step in our tree selection
algorithm.

We now aim to prove (20). By varying the $p_i$ slightly and using Marcinkiewicz
interpolation [15] it suffices to prove this estimate under the assumption that $f_i = \chi_{E_i}$ are
characteristic functions.

Starting with $m$ large and working downward, applying Proposition 9.2 for each
$1 \le i \le n$ for each $m$, we obtain

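Corollary 9.3. For every integer $m$ there exists a collection $\mathbf{T}_m$ of distinct convex trees
in $\mathbf{P}_1$ such that we have the size estimate

\[ \mathrm{size}^*_i(\mathbf{T}_m) \le |E_i|^{1/2}\, 2^{m/2} \tag{92} \]

for all $1 \le i \le n$ and $m \in \mathbb{Z}$, the total tree width estimate

\[ \sum_{T \in \mathbf{T}_m} |I_T| \le C\, 2^{-m} \tag{93} \]

for all $m \in \mathbb{Z}$, and the partitioning

\[ \mathbf{P}_1 = \mathbf{P}_2 \cup \biguplus_{m \in \mathbb{Z}} \biguplus_{T \in \mathbf{T}_m} T, \tag{94} \]

where $\mathbf{P}_2$ is a subset of $\mathbf{P}_1$ with $\mathrm{size}^*_i(\mathbf{P}_2) = 0$ for all $1 \le i \le n$.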



In the $i = n$ case we apply Proposition |0|, (92), and the fact that $f_n$ is a characteristic
function to obtain

\[ \mathrm{size}^*_n(\mathbf{T}_m) \le C \min(2^{m/2} |E_n|^{1/2}, 1). \tag{95} \]

This is in fact true for all $1 \le i \le n$, but we shall only exploit it for $i = n$.
Applying (|9~3|) we may estimate the left-hand side of (120) by 



(Observe that the set P2 gives no contribution, e.g. by an appropriate application of 
Proposition tree-est-prop.) 
We may apply Proposition 7.1 with $\theta_i := 2/p_i$ for $1 \le i \le n-1$, and estimate the
previous expression by

\[ C \sum_{m \in \mathbb{Z}} \sum_{T \in \mathbf{T}_m} |I_T| \Big( \prod_{i=1}^{n-1} \mathrm{size}^*_i(T)^{2/p_i} \Big) \mathrm{size}^*_n(T). \]

By (92), (95) and then (93), we may estimate this by

\[ C \sum_{m \in \mathbb{Z}} 2^{-m} \Big( \prod_{i=1}^{n-1} (2^{m/2} |E_i|^{1/2})^{2/p_i} \Big) \min(2^{m/2} |E_n|^{1/2}, 1). \]

By (|j) this simplifies to

\[ C \Big( \prod_{i=1}^{n-1} |E_i|^{1/p_i} \Big) \sum_{m \in \mathbb{Z}} \min(2^{m/2 - m/p_n} |E_n|^{1/2},\, 2^{-m/p_n}). \]

Performing the $m$ summation we obtain the desired estimate



„ n n 

PePi i=1 i=1 



and conclude Theorem 1.2.
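
The final $m$ summation is a two-sided geometric series: for $p_n > 2$ the first entry of the minimum increases in $m$, the second decreases, and they cross where $2^m |E_n| = 1$, at which point both equal $|E_n|^{1/p_n}$. The sketch below checks this numerically. The function name `m_sum`, the truncation range, and the sample values of $|E_n|$ and $p_n$ are assumptions chosen for illustration; the explicit constant in the comment comes from summing the two geometric tails and is not a constant stated in the paper.

```python
def m_sum(E_n, p_n, m_range=400):
    """Truncated sum over m of min(2^{m/2 - m/p_n} |E_n|^{1/2}, 2^{-m/p_n})."""
    total = 0.0
    for m in range(-m_range, m_range + 1):
        rising = 2.0 ** (m * (0.5 - 1.0 / p_n)) * E_n ** 0.5
        falling = 2.0 ** (-m / p_n)
        total += min(rising, falling)
    return total

def geometric_bound(p_n):
    """Sum of the two geometric tails around the crossover, in units of |E_n|^{1/p_n}."""
    a = 0.5 - 1.0 / p_n   # growth rate of the rising branch (needs p_n > 2)
    b = 1.0 / p_n         # decay rate of the falling branch
    return 1.0 / (1.0 - 2.0 ** (-a)) + 1.0 / (1.0 - 2.0 ** (-b))

# ratios[p_n] lists m_sum / |E_n|^{1/p_n} over a range of set sizes |E_n|.
ratios = {}
for p_n in (2.5, 4.0, 10.0):
    ratios[p_n] = [m_sum(E_n, p_n) / E_n ** (1.0 / p_n)
                   for E_n in (1e-6, 3e-4, 0.7, 1.0, 55.0, 2e5)]
```

In this toy computation the ratio stays below `geometric_bound(p_n)` for every sampled size, which is the statement that the sum is at most $C |E_n|^{1/p_n}$ with $C$ depending only on $p_n$.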



It remains only to prove Proposition 9.2.

10. Proof of Proposition 9.2

Fix $1 \le i \le n$, $f_i$ and $\mathbf{P}$. We shall need to split our notion of size $\mathrm{size}^*_i(\mathbf{P})$ into upper
and lower components.

For any $\xi_i$, and any sign $\pm$, let $H^\pm_{\xi_i}$ denote the Riesz projection to the half-line
$\{\xi \in \mathbb{R} : \pm(\xi - \xi_i) > 0\}$ in frequency space. This Riesz projection is a linear combination of a
modulated Hilbert transform and the identity.

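Observe that

\[ \|f_i\|_{P_i,\xi_i} \sim \|H^+_{\xi_i} f_i\|_{P_i,\xi_i} + \|H^-_{\xi_i} f_i\|_{P_i,\xi_i} \]

for any tile $P_i$ and any $\xi_i$. Thus if we define

\[ \mathrm{size}_{i,\pm}(T) := \Big( \frac{1}{|I_T|} \sum_{P \in T} \|H^\pm_{(\xi_T)_i} f_i\|^2_{P_i,(\xi_T)_i} \Big)^{1/2} + |I_T|^{-1/2} \sup_{m_{i,T}} \|\tilde\chi^{10}_{I_T} T_{m_{i,T}} H^\pm_{(\xi_T)_i} f_i\|_2, \tag{96} \]

where $m_{i,T}$ is a multiplier in the range defined in Definition 6.1, and

\[ \mathrm{size}^*_{i,\pm}(\mathbf{P}) := \sup_{(T,\xi,I) : T \subseteq \mathbf{P}} \mathrm{size}_{i,\pm}(T), \]

then we have

\[ \mathrm{size}^*_i(\mathbf{P}) \sim \mathrm{size}^*_{i,+}(\mathbf{P}) + \mathrm{size}^*_{i,-}(\mathbf{P}). \]

Proposition 9.2 will then follow from a finite number of applications of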



Proposition 10.1. Let $\mathbf{P}$ be a convex collection of multi-tiles. Let $\pm$ be a sign, and let
$m \in \mathbb{Z}$ be such that

\[ \mathrm{size}^*_{i,\pm}(\mathbf{P}) \le \|f_i\|_2\, 2^{m/2}. \tag{97} \]

Then there exists a collection $\mathbf{T}$ of distinct convex trees in $\mathbf{P}$ such that

\[ \sum_{T \in \mathbf{T}} |I_T| \le C\, 2^{-m} \tag{98} \]

and the remainder set $\mathbf{P}' := \mathbf{P} - \biguplus_{T \in \mathbf{T}} T$ is convex and satisfies

\[ \mathrm{size}^*_{i,\pm}(\mathbf{P}') \le \|f_i\|_2\, 2^{(m-1)/2}. \tag{99} \]

Estimate (98) is a variant of Bessel's inequality, expressing that the distinct trees in
this proposition correspond to almost orthogonal components of $f_i$.
Proof. We shall prove Proposition 10.1 for the sign $+$; the other sign follows by applying
the frequency reflection $\xi \to -\xi$ and conjugating $f_i$.

The idea is to remove maximal trees in a greedy selection process from $\mathbf{P}$ until (99) is
obeyed for the remainder set. This procedure shall be given by iteration. If (99) holds
we terminate the iteration. If (99) does not hold, then there exists a tree for which

\[ \mathrm{size}_{i,+}(T) > \|f_i\|_2\, 2^{(m-1)/2}. \tag{100} \]

Since the set of all possible trees $(T, \xi, I)$ obeying (100) is compact, we may select $T$
so that $(\xi_T)_i$ is maximal. By retaining the top data but adding further multi-tiles if
necessary, we may assume that $T$ is maximal in the sense of Definition 4.6. We then add
this tree $T$ to $\mathbf{T}$. Then, we remove all the multi-tiles in $T$ from $\mathbf{P}$. We then repeat this
iteration until (99) holds.

Since $\mathbf{P}$ is finite, this procedure halts in finite time and yields a collection $\mathbf{T}$ of mutually
distinct convex trees. Note that trees with a larger value of $\xi_T$ will be selected before trees
with a smaller value of $\xi_T$. The property (99) holds by construction, so it only remains
to show (98).

As usual, we shall use the $TT^*$ method to prove this orthogonality estimate. One may
think of this Lemma as a phase space version of Lemma 5.1, which was set entirely in
physical space, and we shall need Lemma 5.1 in the proof of this estimate.

Write $X := \mathrm{size}^*_{i,+}(\mathbf{P})$. Observe that

\[ X/2 \le \mathrm{size}_{i,+}(T) \le X \tag{101} \]

for all $T \in \mathbf{T}$ by construction.

It suffices to prove (98) separately for the set of all $T \in \mathbf{T}$ which satisfy

\[ \Big( \frac{1}{|I_T|} \sum_{P \in T} \|H^+_{(\xi_T)_i} f_i\|^2_{P_i,(\xi_T)_i} \Big)^{1/2} \ge X/4 \tag{102} \]

and the set of all $T \in \mathbf{T}$ which satisfy

\[ |I_T|^{-1/2} \sup_{m_{i,T}} \|\tilde\chi^{10}_{I_T} T_{m_{i,T}} H^+_{(\xi_T)_i} f_i\|_2 \ge X/4 \tag{103} \]

for appropriate $m_{i,T}$. We first consider the set of trees which satisfy (102). For simplicity
of notation we may assume this set is equal to $\mathbf{T}$.

From (101), (96), (32) we may associate to each $T \in \mathbf{T}$ and $P \in T$ a multiplier $m_{P_i}$
supported on the interval

\[ \omega^+_{P_i} := \{\xi \in 10\omega_{P_i} : \xi > (\xi_T)_i\} \]



obeying (33) with $\xi_i = (\xi_T)_i$ such that

\[ \sum_{P \in T} c_P^2 \sim X^2 |I_T|, \tag{104} \]

where $c_P$ is the non-negative quantity

\[ c_P := \|\tilde\chi^{10}_{I_P} T_{m_{P_i}} f_i\|_2. \]

From the signed version of (39) we have

\[ c_P \le C X |I_P|^{1/2}. \tag{105} \]

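Summing (104) in $T$, we obtain

\[ \sum_{T \in \mathbf{T}} \sum_{P \in T} c_P^2 \sim X^2 \sum_{T \in \mathbf{T}} |I_T|. \]

On the other hand, from the definition of $c_P$ and duality we can find for each $P$ an
$L^2$-normalized function $g_P$ such that

\[ c_P = \langle f_i, T_{m_{P_i}}(\tilde\chi^{10}_{I_P} g_P) \rangle. \]

We thus have

\[ \sum_{T \in \mathbf{T}} \sum_{P \in T} c_P \langle f_i, T_{m_{P_i}}(\tilde\chi^{10}_{I_P} g_P) \rangle \sim X^2 \sum_{T \in \mathbf{T}} |I_T|. \]

We can write the left-hand side as

\[ \Big\langle f_i, \sum_{T \in \mathbf{T}} \sum_{P \in T} c_P T_{m_{P_i}}(\tilde\chi^{10}_{I_P} g_P) \Big\rangle. \]

By the Cauchy-Schwarz inequality we thus have

\[ X^2 \sum_{T \in \mathbf{T}} |I_T| \le C \|f_i\|_2 \Big\| \sum_{T \in \mathbf{T}} \sum_{P \in T} c_P T_{m_{P_i}}(\tilde\chi^{10}_{I_P} g_P) \Big\|_2. \]

To prove (98) it suffices to show that

\[ \Big\| \sum_{T \in \mathbf{T}} \sum_{P \in T} c_P T_{m_{P_i}}(\tilde\chi^{10}_{I_P} g_P) \Big\|_2 \le C X \Big( \sum_{T \in \mathbf{T}} |I_T| \Big)^{1/2}. \]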

It will be necessary to dyadically decompose the operator T mp around the base frequency 
(£r)i- For each P e T e T, decompose 



where TOp i|S is a bump function adapted to the region 



<4> , : = t(6r)i + 2- s 5000C |u;p|, (£ T )* + 2 2 -*5000Q,M 



and supported in lOc^p. 
We can then decompose 



T mPi (Xi° P 9p) = J2 2 ~ Sh 



s=0 
where

\[ h_{P,s} := T_{m_{P_i,s}}(\tilde\chi^{10}_{I_P} g_P). \tag{106} \]

Thus $h_{P,s}$ has Fourier support in $\omega^+_{P_i,s}$, is bounded in $L^2$, and is rather weakly localized
in physical space near $I_P$. For $s \le M_i$ the multiplier $T_{m_{P_i,s}}$ has good spatial localization
properties, but for $s > M_i$ the multiplier $T_{m_{P_i,s}}$ begins to spread $h_{P,s}$ along a wider interval
than $I_P$. However in the case $s > M_i$ we have the easily verified pointwise estimate

\hp it {x)\ < C2>- M *\Ip\- 1 ' 2 X2-» iIp (x) 5 (107) 

from kernel bounds on T mp a . 

By the triangle inequality it thus suffices to show that 

II £ cphpJ 2 <CXV/™(J2\lT\) l/2 

PeTeT TeT 

for all s > 0. 

Fix s. We square this as 

E E w<^ lS , **,.) < c^ 2 2 s/50 E i^i- 

PeTeT P'eT'eT T e T 

By symmetry it suffices to show that 

E E w|(/^> p ^|<CX 2 2 s / 50 E|/t|. (108) 

PeTeT P'eT'eT:|tjjr|<|u> # ,| r e T 

We first consider the contribution when \up\ = \u)p,\. It suffices to show that 



E cpcp,\(h PyS ,hp, s )\<CT^ E 

PeTeT.P'eT'eT: |wg|=|w s,|=2 Jm PeTeT: |w«|=2 Jm 



pi 

for all integers $m$, since the claim then follows by summing in $m$ and applying (104).

Fix $m$. By Schur's test (i.e. estimating $c_P c_{P'} \le \frac{1}{2}(c_P^2 + c_{P'}^2)$) and symmetry it suffices
to show that

E K^,>p\ s >i<C2 s / 50 

P'GT'&T:\u>p,\=\wp\ 

for all $P \in T \in \mathbf{T}$.
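
The reduction via Schur's test can be illustrated on a small example: for a symmetric non-negative kernel, the bilinear sum is bounded by the largest row sum times the sum of the squared coefficients. In the sketch below the coefficients `c` and the kernel `K` are random or hypothetical stand-ins for the $c_P$ and for $|\langle h_{P,s}, h_{P',s} \rangle|$; the decay profile of `K` is chosen only for illustration.

```python
import random

random.seed(1)
n = 40
# Non-negative coefficients standing in for the c_P.
c = [random.random() for _ in range(n)]
# Symmetric non-negative kernel standing in for |<h_{P,s}, h_{P',s}>|.
K = [[1.0 / (1 + abs(a - b)) ** 2 for b in range(n)] for a in range(n)]

# Left-hand side: the bilinear sum over all pairs (P, P').
bilinear = sum(c[a] * c[b] * K[a][b] for a in range(n) for b in range(n))

# Schur's test: apply c_a c_b <= (c_a^2 + c_b^2) / 2 and use the symmetry of K,
# which leaves sum_a c_a^2 * (row sum of a), bounded by the largest row sum.
row_bound = max(sum(row) for row in K)
schur_bound = row_bound * sum(x * x for x in c)
```

This is why it is enough to bound the single-row sum over $P'$ uniformly in $P$, as the argument goes on to do.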

Fix P. From (|H]) we conclude that {hp s , hp, ) 7^ implies P = P'. For those values 
one has a bound of 

\(hp tS ,hp,J\ < C2- , ^- M ^(l + 2 J ^P + ^- M ^dist(Ip,Ip l )r 2 , 

as can be easily verified from (106) when $s \le M_i$ and (107) when $s > M_i$. The claim then
follows by summing.

It remains to control the contribution when $|\omega_P| < |\omega_{P'}|$. It suffices to show that

E E wvi^wi^^ic ^'^^' )- 

T'ETm-:|/,,KH S l ' *" (109) 
for all $T \in \mathbf{T}$ and $P \in T$, since we may then sum in $P$ to obtain

E E E i^i 1/2 i^r /2 K^,>p' )S )i < c2 s/5o n, 

PerT'eTp/g^.i^^i^i 

and (108) follows by summing in $T$ and applying (105).

It remains to show (109). Fix $P$, $T$. Let $\mathbf{P}(P, s)$ denote the set of all $P'$ which make a
non-zero contribution to (109). We now make the key observation that by Lemma 4.17
the intervals $I_{P'}$ are disjoint from $I_T$. Namely, non-vanishing of $\langle h_{P,s}, h_{P',s} \rangle$ implies
hypotheses (E3p and Q2H). Moreover, by similar reasoning, if $P'$ and $P''$ are in $\mathbf{P}(P, s)$
and $P' \neq P''$, then $I_{P'}$ and $I_{P''}$ are disjoint, as one can see from (14) if $|I_{P'}| = |I_{P''}|$ and
from (a slight variant of) Lemma 4.17 if $|I_{P'}| \neq |I_{P''}|$. For future reference we summarize:

Observation 10.2. The intervals $I_{P'}$ are disjoint from each other and from $I_T$.

We first verify (109) in the case $s > M_i$. In this case we see from (107) and a calculation
that

\(h- h- \\ < C1\T~ |-V2|rj-i/2 9 -(«-itfi) / v 3 
\\n-p n p , )\^u\i pl \ \i p \ a ; X2 s - M n s 

Jix, p 



la' 1 



./RUt p j p 



so by Observation |10.2| the inequality ( |109| ) reduces to 

dist(2( 8 - M< )/p,R\/ r ) o 
'R\/ T ' " Jp ' ' ' l J p| 

which is easily verified. 

Now consider the case s < Mi- By ( |106| ) we have 

where 

^-xL 10 sup |t; pl T mPi Mf P gp)\. 

P':\Ip,\<\Ip\ 

From the decay of the kernel of $T_{m_{P_i,s}}$ we can control $F$ pointwise by the
Hardy-Littlewood maximal function:

\[ F(x) \le C M g_P(x). \]

From the previous, (109) reduces to



P'eP{P,s) 



<"' (110) 



so by Cauchy-Schwarz and the Hardy-Littlewood maximal inequality it suffices to show 
that 

iiv 10 V^ ir-i 1 / 2 -? 10 in-ill <r\r i 1/2 n i dist ( J P' R \ Jr K -2 

WXip Z^ |J ^'' Xi pi \9p'\\\2S^\lp\ UH jjq ) ■ 

P'eP(P,s) p 

Let us first consider the portion of the $L^2$ norm in the region

Q := {x G I T : dist(x, R\J T ) > \Ip\ + -dist{Ip, U\I T )}- 
In this region we have from the $L^2$ boundedness of $g_{P'}$ and the decay of $\tilde\chi_{I_{P'}}^{10}$ that
II I T 1 1/2 ~io | _ mi <? n\T |-i/2 f n i dist(o;, Q) 5 

II I Ip* T XlAdpi I ||jy»(n) < G Mp| / (H r=r-j — j dx 

Jip, \ 1 p\ 

(in fact one can get much better bounds than this, especially if $|I_{P'}| \ll |I_P|$) so by Cauchy-Schwarz
and the above key observation (pairwise disjointness of the $I_{P'}$ and their disjointness from
$I_T$) again, this contribution to (110) is bounded by

c\i P \-v* [ (i + ^^-rux 

Jn\i T \ip\ 

which is acceptable. 

It remains to estimate the contribution outside of Q. Since we have 

\\xi P xxi\J™< c ( 1 + nq ) > 

l J pl 

this follows simply from Lemma 5.3.

This completes the proof of (98) for the trees which satisfy (102).

Now we consider the set of trees which satisfy (103). For simplicity of notation we may
again assume this set is equal to $\mathbf{T}$. The proof is a reprise of the previous case.
We may associate to each $T \in \mathbf{T}$ a multiplier $m_{i,T}$ supported on the interval

wjr := {£ E l(K r : f > (fr),;} 

obeying (35) such that

$c_T^2 \sim \lambda^2 |I_T|$ (111)

where $c_T$ is the non-negative quantity

$c_T := \| \tilde\chi_{I_T}^{10} T_{m_{i,T}} f_i \|_2.$

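By duality we can find for each $T \in \mathbf{T}$ an $L^2$-normalized function $g_T$ such that

$c_T = \langle f_i, T_{m_{i,T}}(\tilde\chi_{I_T}^{10} g_T) \rangle.$

We thus have

$\sum_{T \in \mathbf{T}} c_T \langle f_i, T_{m_{i,T}}(\tilde\chi_{I_T}^{10} g_T) \rangle \sim \lambda^2 \sum_{T \in \mathbf{T}} |I_T|.$

By the Cauchy-Schwarz inequality we have

$\lambda^2 \sum_{T \in \mathbf{T}} |I_T| \le C \|f_i\|_2 \, \| \sum_{T \in \mathbf{T}} c_T T_{m_{i,T}}(\tilde\chi_{I_T}^{10} g_T) \|_2.$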

To prove (98) it thus suffices to show that

$\| \sum_{T \in \mathbf{T}} c_T T_{m_{i,T}}(\tilde\chi_{I_T}^{10} g_T) \|_2 \le C \lambda \, ( \sum_{T \in \mathbf{T}} |I_T| )^{1/2}.$
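
For orientation, the chain of inequalities behind this reduction can be written out as follows. This is a sketch under a reading of the garbled displays above: it assumes that (111) says $c_T^2 \sim \lambda^2 |I_T|$, that the collection $\mathbf{T}$ is finite, and it uses the shorthand $G_T := T_{m_{i,T}}(\tilde\chi_{I_T}^{10} g_T)$, which is introduced here only for brevity. The final line is a bound of the shape one expects the target estimate to have, not a quotation of it.

```latex
% Assumptions: c_T \ge 0, c_T^2 \sim \lambda^2 |I_T|, c_T = \langle f_i, G_T \rangle,
% where G_T := T_{m_{i,T}}(\tilde\chi_{I_T}^{10} g_T) is shorthand introduced for this sketch.
\begin{align*}
\lambda^2 \sum_{T} |I_T|
 &\le C \sum_{T} c_T^2
  = C \sum_{T} c_T \langle f_i, G_T \rangle
  = C \Big\langle f_i, \sum_{T} c_T G_T \Big\rangle \\
 &\le C \|f_i\|_2 \Big\| \sum_{T} c_T G_T \Big\|_2
  \le C' \lambda\, \|f_i\|_2 \Big( \sum_{T} |I_T| \Big)^{1/2},
\end{align*}
% and dividing by \lambda (\sum_T |I_T|)^{1/2} gives
\[
\sum_{T} |I_T| \le C'' \lambda^{-2} \|f_i\|_2^2 .
\]
```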

We dyadically decompose the operator $T_{m_{i,T}}$ around the base frequency $(\xi_T)_i$:



T = \^2~ S T 

±m i,T / j ± rn i ; 



s=0 
where $m_{i,T,s}$ is a bump function adapted to the region

$\omega_{i,T,s} := [(\xi_T)_i + 2^{-s} 10 |\omega_{i,T}|, \; (\xi_T)_i + 2^{2-s} 10 |\omega_{i,T}|]$

and supported in IOo^t- 
Define 

$h_{T,s} := T_{m_{i,T,s}}(\tilde\chi_{I_T}^{10} g_T).$ (112)

By the triangle inequality it thus suffices to show that 

||^c T / iT , s || 2 <CX2^(^|/ T |) 1 / 2 

TGT TgT 

for all $s \ge 0$. (While a better power of $s$ can be achieved, we shall not attempt to do so
here.) Fix $s$. By squaring and using symmetry it suffices to show that

J2 E CTCT'\{hT, s ,h T ', s )\ <CX 2 2 s J2\h\. 

TeTT / eT:|o)i,T|<|w iT /| TGT 
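
The "squaring and symmetry" step is the following elementary computation, recorded here as a sketch. It uses only that the $c_T$ are non-negative and that $|\langle h_{T,s}, h_{T',s} \rangle|$ is symmetric in $(T, T')$; the ordering of the frequency intervals is taken to be non-strict, an assumption made here so that pairs of equal length are covered.

```latex
% c_T \ge 0; the ordering |\omega_{i,T}| \le |\omega_{i,T'}| is taken non-strict (assumption).
\begin{align*}
\Big\| \sum_{T} c_T h_{T,s} \Big\|_2^2
 &= \sum_{T} \sum_{T'} c_T c_{T'} \langle h_{T,s}, h_{T',s} \rangle
  \le \sum_{T} \sum_{T'} c_T c_{T'} |\langle h_{T,s}, h_{T',s} \rangle| \\
 &\le 2 \sum_{T} \sum_{T' : |\omega_{i,T}| \le |\omega_{i,T'}|}
      c_T c_{T'} |\langle h_{T,s}, h_{T',s} \rangle| .
\end{align*}
```

Every pair $(T, T')$ satisfies at least one of the two orderings, and exchanging the roles of $T$ and $T'$ leaves the summand unchanged, which accounts for the factor 2.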



Using (111) we see that it suffices to prove, for each $T \in \mathbf{T}$,



J2 \lT>\ 1/2 \(h T ,s,h TI , s }\<C¥\I T \ 1 2. 

T'eT:|wi, T |<K iT /| 

Fix $T$. Observe

$\langle h_{T,s}, h_{T',s} \rangle = \langle T^*_{m_{i,T',s}} T_{m_{i,T,s}} (\tilde\chi_{I_T}^{10} g_T), \; \tilde\chi_{I_{T'}}^{10} g_{T'} \rangle$
and the pointwise bound

\T^ Tl T mitTta xJ T gT{x)\ < Cxl({s-M i)+ ) lT {x)Mg T {x) 

with the Hardy-Littlewood maximal function $M g_T$. The latter follows from the kernel
bound



\K{x,y)\ < Cllrl-^-^^il + IItI^-^-^Ix - y\Y 



100 



for the kernel $K$ of the operator $T^*_{m_{i,T',s}} T_{m_{i,T,s}}$.

Let $\mathbf{T}(T)$ be the set of all $T' \in \mathbf{T}$ such that $|\omega_{i,T}| < |\omega_{i,T'}|$ and $\langle h_{T,s}, h_{T',s} \rangle \neq 0$. By the
above and the Hardy-Littlewood maximal theorem it suffices to show

II V^ \T l 1 /2,~ / 5 -10 n II < ro ((s-Mi) + )\ j 1 1/2 

II 2^ ItT'i ' x 2 ((s-M i ) + ) lT xi T ,gT>\\2<L<^ y Im 

T'eT(T) 



This however follows from Lemma 5.3 provided we can show that $\mathbf{T}(T)$ can be split into two
subsets, each of which has the property that the intervals $I_{T'}$ with $T'$ in the subset are
pairwise disjoint. This in turn follows from Lemma 4.18.



This completes the proof of (98) for the trees which satisfy (103).